Category Archives: Storage

DRBD Performance (draft)

DRBD can be used like RAID 1 over the network, but it does not perform as well as a local RAID 1. Even more than RAID 1, its performance depends heavily on the hardware.

Let’s break DRBD replication into steps:

1. Data transfer through network:

Besides good bandwidth, the DRBD documentation suggests some parameters to fine-tune network performance, such as:

– The rate of synchronization,

– Using checksum-based synchronization,

– Replication modes,
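As a rough sketch of where these three knobs live in a DRBD 8.4 resource file: the resource name r0 and the values below are placeholders I picked for illustration, not recommendations, and older releases place these options differently.

```
resource r0 {
  net {
    # replication mode: A (asynchronous), B (semi-synchronous) or C (synchronous)
    protocol C;
    # checksum-based synchronization
    csums-alg sha1;
  }
  disk {
    # rate of synchronization (placeholder value)
    resync-rate 40M;
  }
}
```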

2. Read/write data on the disk:

When I deployed DRBD, I had 2 × 8 TB SSDs and a gigabit Ethernet card, and I was still not satisfied. To tackle I/O latency, you should consider at least two parameters:

– The I/O scheduler, as it sits between DRBD and the disk:

The official docs suggest using the deadline I/O scheduler, although I think DRBD replication (RAID 1-like) is write-heavy, and deadline favors read operations. I chose no I/O scheduler at all, because I used SSDs and the kernel I/O schedulers are mostly designed around HDD seek times xd.
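To see which scheduler a disk is using, the kernel lists the available ones in sysfs and puts the active one in square brackets. A minimal sketch, assuming a 3.x kernel where the "no scheduler" choice is called noop (newer multi-queue kernels usually call it none); active_scheduler is a helper name made up for this example and sdb is a placeholder for the DRBD backing disk:

```shell
# Print the active I/O scheduler from a sysfs "scheduler" line read on stdin.
# The kernel marks the active one with brackets, e.g. "noop deadline [cfq]".
active_scheduler() {
    sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Usage (as root; "sdb" is a placeholder for your backing disk):
#   active_scheduler < /sys/block/sdb/queue/scheduler
#   echo noop > /sys/block/sdb/queue/scheduler    # lasts until the next reboot
```

The echo only changes the running system, so it has to be repeated at boot (for example from a local startup script) if you want to keep it.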

– The read/write speed of the disk:

I used Bonnie++ to benchmark the disks I was considering. SSDs, although expensive, perform very well.


Replicate your data using DRBD (draft)

Some programs just don’t include replication as an option. DRBD is then a very good way to replicate your service/data transparently.

Just keep your service’s configuration files and data on one disk, and deploy DRBD to replicate that disk to another server. You can use Heartbeat or Corosync alongside it to provide high availability.

What’s amazing is that DRBD is OPEN SOURCE.

To deploy DRBD, simply follow these steps on both nodes:

– Recompile your kernel with DRBD support:

I tested DRBD on the Gentoo distribution:

# cd /usr/src/linux-3.10.7-gentoo/
linux-3.10.7-gentoo # make menuconfig

scripts/kconfig/mconf Kconfig
#
# using defaults found in /boot/config-3.4.45-sdf134-core2-64
#

*** End of the configuration.
*** Execute 'make' to start the build or try 'make help'.

[Image: DRBD]
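Before building, it may be worth confirming that the option really ended up in .config. A small sketch, assuming the mainline option name CONFIG_BLK_DEV_DRBD; has_drbd_support is a helper name invented for this example:

```shell
# Succeed if the given kernel config enables DRBD,
# either built in (=y) or as a module (=m).
has_drbd_support() {
    grep -Eq '^CONFIG_BLK_DEV_DRBD=(y|m)$' "$1"
}

# Usage, from the kernel source directory:
#   has_drbd_support .config && echo "DRBD enabled" || echo "DRBD missing"
```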
Compile and build your kernel, then copy the image to where your boot loader is configured to look for it.

# cat /proc/cpuinfo                # count the cores to pick a -j value
linux-3.10.7-gentoo # time make -j4
linux-3.10.7-gentoo # make modules_install
linux-3.10.7-gentoo # cp arch/x86/boot/bzImage /boot/linux-3.10.7
# reboot

– Synchronize time with an NTP server:

# emerge -tav net-misc/ntp

and follow the Gentoo wiki.

– Configure network connectivity on both nodes.

I suggest using a dedicated network interface for DRBD. Then use the hosts file so that both nodes can identify and reach each other.

# echo '192.168.1.10 node1' >> /etc/hosts

Check that no other service is using ports 7788 and 7799, and that no firewall rule blocks incoming or outgoing TCP connections between the nodes.
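One way to do the port check is to look through the list of listening sockets. A minimal sketch, assuming ss from iproute2 is installed (netstat -tln prints a compatible layout); port_in_use is a helper name made up for this example:

```shell
# Read `ss -tln` (or `netstat -tln`) output on stdin and succeed if any
# listening socket uses the given TCP port.
port_in_use() {
    awk -v port="$1" '
        { for (i = 1; i <= NF; i++) if ($i ~ (":" port "$")) found = 1 }
        END { exit !found }'
}

# Usage, on each node:
#   for p in 7788 7799; do
#       ss -tln | port_in_use "$p" && echo "port $p is already taken"
#   done
```

For the firewall side, listing the active rules with iptables -L -n on both nodes is usually enough to spot a rule that drops traffic between them.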

– Install DRBD control tools

# emerge -tav sys-cluster/drbd

* IMPORTANT: 8 news items need reading for repository 'gentoo'.
* Use eselect news to read news items.

These are the packages that would be merged, in reverse order:

Calculating dependencies... done!
[ebuild  N     ] sys-cluster/drbd-8.4.2  USE="udev -bash-completion -heartbeat -pacemaker -xen" 660 kB

Total: 1 package (1 new), Size of downloads: 660 kB

Would you like to merge these packages? [Yes/No]

– Prepare your disks:

Partition your disk and hand the partition to DRBD empty, without a filesystem.

# fdisk /dev/sdb

Welcome to fdisk (util-linux 2.21.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Command (m for help):

– Configure your resources:

DRBD reads /etc/drbd.conf, which by default just includes /etc/drbd.d/global_common.conf and the /etc/drbd.d/*.res files.

Each resource is configured in its own file, /etc/drbd.d/res_name.res.

Refer to the DRBD documentation for more details.

– Create metadata on your disk and start DRBD

# drbdadm create-md res_name0

# /etc/init.d/drbd start

# drbdadm primary --force res_name0  # Only on the primary node; --force is needed for the very first promotion

You can then format your DRBD device, mount it and write to it, but only on the primary node.

– Watch DRBD resource synchronization using the proc file:

# cat /proc/drbd

Troubleshooting?

It’s simple and straightforward, but there is still one more thing you should consider: performance.

Monitoring DRBD status (draft)

You have surely wondered: should I always check DRBD status by reading the bunch of lines printed by:

cat /proc/drbd

Monitoring DRBD is a must, so that you can be sure your data is consistent and up to date. I worked with DRBD 8.4 and tried to come up with something that checks the whole DRBD status.

I mean the following, in this order:

  1. Split brain
  2. Connection status
  3. Resource role
  4. Disk status
  5. I/O status
  6. Performance

I’m no DRBD expert; I just tried to think of a logical order so that my monitoring would be efficient and raise no redundant alarms. And yes, we mathematicians adore optimization :).

The flow of checks goes as in the following chart:

[Figure: DRBD monitoring logic flowchart]
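The first four checks in that order can be sketched in a few lines of shell and awk. This is only an illustration, not the script I wrote: it assumes the DRBD 8.4 /proc/drbd layout (cs:, ro: and ds: fields on each resource line), treats a StandAlone connection as a possible split brain, and borrows Nagios-style exit codes; check_drbd is a name made up for this example.

```shell
# Walk the checks in order for every resource line of a /proc/drbd-style
# file and print the first problem found.
# Exit codes (an assumption of this sketch): 0 OK, 1 WARNING, 2 CRITICAL.
check_drbd() {
    awk '
        / cs:/ {
            # Collect the cs (connection), ro (roles) and ds (disks) fields.
            for (i = 1; i <= NF; i++) {
                split($i, kv, ":")
                if (kv[1] == "cs" || kv[1] == "ro" || kv[1] == "ds")
                    f[kv[1]] = kv[2]
            }
            if (f["cs"] == "StandAlone") {
                print "CRITICAL " $1 " StandAlone (possible split brain)"
                rc = 2; exit
            }
            if (f["cs"] != "Connected") {
                print "CRITICAL " $1 " connection is " f["cs"]
                rc = 2; exit
            }
            if (f["ro"] !~ /Primary/) {
                print "WARNING " $1 " no primary node: " f["ro"]
                rc = 1; exit
            }
            if (f["ds"] != "UpToDate/UpToDate") {
                print "CRITICAL " $1 " disk states are " f["ds"]
                rc = 2; exit
            }
        }
        END { if (!rc) print "OK"; exit rc + 0 }
    ' "${1:-/proc/drbd}"
}

# Usage:
#   check_drbd              # reads /proc/drbd
#   check_drbd saved.txt    # or a saved copy, handy for testing
```

Stopping at the first problem is what avoids redundant alarms: a split brain also shows up as a broken connection and an unknown peer disk, but it is reported only once.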

Distributed Replicated Block Device (DRBD)

I cannot be more explicit than the documentation, but I would like to report my own experience with DRBD :).

If you’re looking for something that blindly copies the blocks of your partition and keeps them synchronized with another partition, without caring about the filesystem or the data, go for DRBD. It’s very cool and easy to implement; just get your disks and your gigabit Ethernet card ready :D.

DRBD sits between the filesystem and the I/O scheduler layer of the operating system, so it doesn’t care about higher layers like the filesystem. That’s why I said it copies blindly xd xd.

[Image: DRBD]

How you get DRBD onto your server depends on your Linux distribution. I did it on CentOS 6.

After setting up my resources and starting DRBD, I faced some difficulties, mainly because I didn’t understand how DRBD works!!!

The idea is simple, but look at me: I hesitated and couldn’t figure out some very simple things, like:

  • Not knowing what to mount: the device or the disk? /dev/drbd1 or /dev/sda2?

When I needed to mount my DRBD partition, defined in the following configuration,

resource r0 {
  on node1 {
    device    /dev/drbd1;
    disk      /dev/sda2;
    address   10.1.1.31:7789;
    meta-disk internal;
  }
  on node2 {
    device    /dev/drbd1;
    disk      /dev/sda7;
    address   10.1.1.32:7789;
    meta-disk internal;
  }
}

I tried:

mount /dev/sda2

Which is so stupid. Wanna know why? We only mount a filesystem onto a point of the overall directory tree. Let’s look at how DRBD sees /dev/sda2. To DRBD, whether or not the partition holds a filesystem is just information about the data on the partition. Plus, DRBD needs to store some metadata about the replicated partition, and by default it keeps this metadata on the same partition. So trying to mount /dev/sda2 is like mounting [metadata format + filesystem], which is nonsense. That explains why one needs to zero out a partition before DRBD can use it. After that, you can create your filesystem on the DRBD device /dev/drbd1 and mount it if you like.
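In command form, that last step looks roughly like the sketch below. It assumes the r0 resource above, ext3 and /mountpoint as in my fstab line; prepare_drbd_fs and the DRBD_RUN variable are names invented for this example. By default it only prints the commands, so nothing gets formatted by accident.

```shell
# Create the filesystem on the DRBD device (never on the backing disk) and
# mount it. Commands are only printed unless DRBD_RUN is set to empty.
prepare_drbd_fs() {
    dev=$1
    mnt=$2
    ${DRBD_RUN-echo} mkfs.ext3 "$dev" &&
    ${DRBD_RUN-echo} mount "$dev" "$mnt"
}

# Dry run: prints the mkfs.ext3 and mount commands.
prepare_drbd_fs /dev/drbd1 /mountpoint
```

To run it for real, on the primary node only, call it as DRBD_RUN= prepare_drbd_fs /dev/drbd1 /mountpoint.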

  • Couldn’t restart my server:

I needed my DRBD partition to be mounted at OS startup. So, as always, I jumped into /etc/fstab and joyfully added this line:

/dev/sda2       /mountpoint           ext3    defaults        0       2

When I restarted my server, the boot process stopped at the fsck check of the DRBD partition and the system didn’t start. No, no, no...

The only thing I could do was restart my server in rescue mode with a live CD. Everything looked OK, yet my system wouldn’t start. Why?

Look at my /etc/fstab line:

The options field is defaults, which implicitly means mount at boot;
the sixth field is 2, which tells fsck to check the filesystem of my partition at boot time.

So the DRBD partition /dev/sda2 was being mounted at boot time, and my server couldn’t start because it hung trying to mount a partition that contains no recognizable filesystem.

And even with the noauto option, it couldn’t start, because it hung while fsck checked an incoherent filesystem!!!!

So the solution was to use the noauto option with 0 in the sixth field, which prevents any fsck check at boot time.
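A quick way to spot this kind of entry before rebooting is to scan fstab for the two fields that bit me. A minimal sketch; fstab_boot_risk is a helper name made up for this example, and the "safe" line in the comments uses /dev/drbd1 because, as explained above, that is the device to mount rather than /dev/sda2:

```shell
# List what would happen at boot for a device: an entry without "noauto" is
# mounted automatically, and a non-zero sixth field makes fsck check it.
fstab_boot_risk() {
    awk -v dev="$1" '
        $1 == dev {
            if ($4 !~ /(^|,)noauto(,|$)/) print "mounted at boot: " $0
            if ($6 + 0 != 0)              print "fsck at boot: " $0
        }' "${2:-/etc/fstab}"
}

# Usage:
#   fstab_boot_risk /dev/sda2     # my original line triggers both warnings
#   fstab_boot_risk /dev/drbd1    # prints nothing for a safe entry such as:
#   /dev/drbd1  /mountpoint  ext3  noauto  0  0
```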

DRBD is cool and fun. You can use the drbdadm utility to manage it, and with cat /proc/drbd you can see the status of the replication and of the nodes. I actually used it to write a script that monitors DRBD 8.4, following the documentation and working out the logic for the cases to monitor.
I also found a good article describing how to use DRBD for MySQL replication. It’s cool compared to the master/slave replication done at the application level; just remember to include your MySQL configuration files in the partition to be replicated.

This is, in brief, how my first contact with DRBD went. I can also add that the DRBD guys on IRC were very generous and ready to answer every question. Thank you guys :).