NexentaStor Server – Part 3 – Internal Benchmarks

Now that NexentaStor is fully configured, it's time to do some benchmarks. From experience, random IO was always wicked fast, but I found that sequential performance was really dependent on CPU clock speed. In my previous builds typical speeds were about 300MB/s writes and 500MB/s reads. Let's see if this new server can beat that.

All drives are 146GB 15K SAS Drives.

During these tests Deduplication and Compression are turned off so that only the disks factor into the equation.
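These toggles live in the NexentaStor web UI, but they map straight to ZFS dataset properties. A quick sanity check from the raw shell would look something like this (the pool/folder name is hypothetical):

# zfs get compression,dedup tank/vmstore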

Config: 3 Groups, 2 Drive Mirrors = 6 Drives on SAS6IR. Sync On.

Using dd to generate a sequential write in 16k blocks.
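For reference, bs=16k with count=3M targets 3,145,728 × 16,384 bytes = 51,539,607,552 bytes, roughly 52GB (48GiB) per run, which matches the file sizes reported in the completed runs below.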

> dd if=/dev/zero of=ddfile1 bs=16k count=3M
^C62850+0 records in
62850+0 records out
1029734400 bytes (1.0 GB) copied, 100.685 seconds, 10.2 MB/s

Ended up aborting this via Ctrl-C, as at 10MB/s the benchmark would take a LONG time. 10MB/s is slow. In fact, that's helluva slow for an array of 15K SAS drives; 10MB/s writes are downright pathetic.

Let’s try to step things up a bit and include all the drives attached.

Config: 10 Groups, 2 Drive Mirrors = 20 Drives on SAS6IR and H200 Combined.

> dd if=/dev/zero of=ddfile1 bs=16k count=3M
^C79104+0 records in
79104+0 records out
1296039936 bytes (1.3 GB) copied, 72.4045 seconds, 17.9 MB/s

Wow, 18MB/s. I've got USB drives that write faster than this.

Ugh. Without a dedicated SLOG device (a separate log device for the ZIL), sync write performance is going to be crap. NexentaStor is basically waiting on each drive to confirm the write has been committed to disk before acknowledging the next block. Great for data integrity, terrible for performance.
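For reference, the Sync toggle in the UI maps to the per-dataset ZFS sync property (on builds new enough to have it), and an SLOG is just a dedicated log device added to the pool. From the raw shell it would look something like this, with hypothetical pool/folder/device names:

# zfs get sync tank/vmstore
# zfs set sync=disabled tank/vmstore
# zpool add tank log c3t0d0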

Let’s disable Sync and try again. This time again with only 6 drives.

> dd if=/dev/zero of=ddfile1 bs=16k count=3M
^C1185778+0 records in
1185778+0 records out
19427786752 bytes (19 GB) copied, 63.966 seconds, 304 MB/s

That’s what I’m talking about! Now let’s try it with all the drives.

> dd if=/dev/zero of=ddfile1 bs=16k count=3M
^C1713095+0 records in
1713094+0 records out
28067332096 bytes (28 GB) copied, 68.8867 seconds, 407 MB/s

Success!

Now, a little warning about disabling Sync. Unless the server is on protected power (UPS, backup genny), disabling Sync is downright dangerous for your data: a sudden power failure means any writes that haven't made it to disk are gone, and applications that thought their data was committed can end up with corruption. In my case, with my custom-built UPS system, I have a guaranteed 4 hours of uptime, which is plenty of time to commit any writes to disk and gracefully shut down all servers.

A quick read test

> dd if=ddfile1 of=/dev/null bs=16k
1713094+0 records in
1713094+0 records out
28067332096 bytes (28 GB) copied, 25.3297 seconds, 1.1 GB/s

Whoa! In this case NexentaStor reads from BOTH drives in each mirror, which results in INSANE read speeds.

Update: I realized during the benchmarks that one drive was sitting at 90% utilization while the others were hovering around 60%. Replacing that drive and resilvering the array yielded absolutely unbelievable numbers.
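For reference, one way to spot a lagging disk from the raw shell is to watch the busy column in iostat during a run, then swap the disk and let ZFS resilver (device names here are hypothetical):

# iostat -xn 5
# zpool replace tank c1t4d0 c1t9d0
# zpool status tank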

> dd if=/dev/zero of=ddfile1 bs=16k count=3M
3145728+0 records in
3145728+0 records out
51539607552 bytes (52 GB) copied, 85.7514 seconds, 601 MB/s

And a quick read test.

> dd if=ddfile1 of=/dev/null bs=16k
3145728+0 records in
3145728+0 records out
51539607552 bytes (52 GB) copied, 46.07 seconds, 1.1 GB/s

Am I impressed? Hell yeah. One thing is for certain: the NICs are definitely going to be the bottleneck, even with 4-way Active/Active MPIO. I'm going to have to start thinking about moving to a 10Gb network next year. What good is all that speed if I'm bottlenecked at the NIC?
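Rough math: four gigabit ports top out at 4 × 125MB/s = 500MB/s of raw line rate, realistically more like 450MB/s of payload after protocol overhead. That's already below the 601MB/s sequential writes and nowhere near the 1.1GB/s reads measured above.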

NexentaStor Server – Part 2 – Configuration

To complete the installation of NexentaStor, the system reboots and presents a registration screen with a unique Machine ID. A quick trip to
http://www.nexenta.com/corp/downloads/register-community-download returns an email with the activation code.

The console wizard then proceeds with setting up the management network interface. In this case I select the bnx0 interface that corresponds to the management link I set up earlier.

bnx0 IP Address: 192.168.77.23
bnx0 netmask: 255.255.255.0
bnx0 mtu: 1496 Error.

No worries, there's a known fix for that. Let's just skip it for now.

Name Server #1: 192.168.77.201
Name Server #2: 192.168.77.202
Name Server #3:
Gateway IP address: 192.168.77.1

Once Nexenta completes its internal configuration, log into the console:

username: root
password: nexenta

# setup network interface bnx0 static
bnx0 IP Address: 192.168.77.23
bnx0 netmask: 255.255.255.0
bnx0 mtu: 8982
Warning: changing mtu MAY require driver re-load! Network Interface(s) could be re-initialized.
Change MTU from the current 1496 to 8982 ?  (y/n) y

# setup network interface bnx0 static
bnx0 IP Address: 192.168.77.23
bnx0 netmask: 255.255.255.0
bnx0 mtu: 1500
Warning: changing mtu MAY require driver re-load! Network Interface(s) could be re-initialized.
Change MTU from the current 8982 to 1500 ?  (y/n) y

# ping www.google.ca
www.google.ca is alive

Yay, the server is online. While I'm at it, might as well check for any updates.

# setup appliance upgrade
Checking repository sources. Please wait...
No new upgrades/packages available.

Now to go back to the Web UI to complete the configuration of the filer.

Ran into an issue configuring Jumbo Frames on the Intel PRO/1000 VT card.
For whatever reason NexentaStor will only allow ports 3 and 4 (igb2/igb3) to be set to jumbo frames (9K). Attempting to set the igb0/igb1 ports to Jumbo Frames results in this error:
SystemCallError: failed to configure igb0 with ip 192.168.91.1 netmask 255.255.255.0 mtu 9000 broadcast + up: ifconfig: setifmtu: SIOCSLIFMTU: igb0: Invalid argument
This of course doesn't make sense, since Intel PRO cards are on Nexenta's HCL. To fix this, I've gotta get into the guts of the OS. In the console or through SSH:

# option expert_mode="1" -s
# !bash
You are about to enter the Unix ("raw") shell and execute low-level Unix command(s). Warning: using low-level Unix commands is not recommended! Execute? Yes

Now the actual Solaris shell is enabled and all commands can be accessed. From here I need to unplumb the igb0/igb1 interfaces:

# dladm show-phys
LINK    MEDIA     STATE   SPEED   DUPLEX   DEVICE
bnx2    Ethernet  down    0       half     bnx2
igb2    Ethernet  up      1000    full     igb2
bnx0    Ethernet  up      1000    full     bnx0
igb0    Ethernet  up      1000    full     igb0
bnx1    Ethernet  down    0       half     bnx1
bnx3    Ethernet  down    0       half     bnx3
igb3    Ethernet  up      1000    full     igb3
igb1    Ethernet  up      1000    full     igb1

# dladm show-linkprop -p mtu igb0
LINK   PROPERTY   PERM   VALUE   DEFAULT   POSSIBLE
igb0   mtu        r-     1500    1500      60-9000

Well, that’s interesting. Why can’t we set it then?

# ifconfig igb0 unplumb
# dladm set-linkprop -p mtu=9000 igb0
# ifconfig igb0 plumb 192.168.91.1 up

Now repeat for igb1. A quick sketch of that repeat is below; the address is hypothetical, since I didn't note igb1's IP here:
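# ifconfig igb1 unplumb
# dladm set-linkprop -p mtu=9000 igb1
# ifconfig igb1 plumb 192.168.92.1 up

Once igb1 has been configured: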

# exit
Important: To re-sync the appliance's management state information, please
consider running 'setup appliance nms restart' command.

# setup appliance nms restart
Trying to gain exclusive access to the appliance.
This operation may take up to 30 seconds to complete. Please wait...
Exclusive access granted.
This operation will make Nexenta Management Server temporarily unavailable. Proceed? Yes
Restarting NMS. Please wait...

Now to go back to the web interface and configure the IP. Not sure why this needs to be done again, but it works fine this time.

Once the rest of the wizard is completed the status screen shows the current configuration.


In Part 3 I’ll be doing some benchmarks…

NexentaStor Server – Part 1 – The Build

Overview

A while back I picked up an HP DL160 G6 to replace my aging Dell PowerEdge 2950 II as my NexentaStor server. The end result was less than perfect and I was never happy with the performance of the server. I was seeing performance decrease compared to the old PE2950 II, and the two on-board NICs were not sufficient to feed all the servers and workstations with data. So I decided to scrap it and rebuild from scratch. With that in mind, I picked up yet another Dell PowerEdge R710 from Kijiji.

This R710 comes with two X5550 2.66GHz quad-core CPUs and 96GB of RAM. I'll pilfer some of that RAM, since it comes as 8GB sticks, and put it to good use elsewhere, replacing the 8GB sticks with the 4GB sticks I have an abundance of, for a total of 72GB of RAM.

Dell PowerConnect 5224 – The Switch Switcheroo Pt. 2

In my previous post I started the process of swapping out the Dell network switches in order to replace an aging Netgear GS716T at one of my DC racks. This time I need to reconfigure the swapped-out Dell PowerConnect 5224 switch so that when I rack it up at the data center, only minimal configuration will be required.

This switch is massive compared to the Dell PowerConnect 5324. In my experience it is just as good as the 53 series switches. The feature set is about the same even though the Web UI is slightly different.

To factory reset the PowerConnect 5224, repeatedly hit Ctrl-F as the switch boots up to get into the recovery menu. From there, delete the startup-config and set the Factory Default file as the start file.

Once the switch is reloaded, standard config commands will configure it for Ethernet:

Console#configure
Console(config)#interface vlan 1
Console(config-if)#ip address 192.168.77.192 255.255.255.0
Console(config-if)#end
Console#copy running-config startup-config
Console#reload

The switch is now ready to be configured via the Web UI.

Building pfSense Firewall

A few weeks ago I picked up this old Intel SR1530AH server. It was picked up specifically to act as a firewall for one of my racks at a data center. This is very basic, almost desktop-level hardware: an Intel Core 2 Duo 2.0GHz CPU and 4GB of RAM. The only non-desktop thing about it is that the RAM is ECC.

This is actually a very nice machine for the purpose. Based on previous experience with a 2.0GHz CPU, it should be able to sustain approximately 300Mbit/s of unencrypted traffic across interfaces, which is quite plenty since the uplink to the internet is only a 100Mbit connection. Now, because this server only comes with two onboard network interfaces, I needed to add an additional network card. I had an extra low-profile riser in my junk drawer, so a couple of minutes later I added a dual-port Intel Pro NIC. This gives me the minimum 3 interfaces I require (WAN, LAN and DMZ). I could potentially use the third interface as a bridged DMZ for some of the VMs.

In order to install pfSense on the server, I needed persistent storage. I chose to go with an OCZ Agility 3 60GB SSD. It seems like overkill, since the firewall installation is only about 6GB including swap space, but SSDs are pretty cheap nowadays, pfSense hardly ever writes to local storage, and the lack of moving parts should equate to a long drive life.

Ran into a small issue installing the 2.5″ SSD into the server, as the adapter that comes with the drive is insufficient to mount it. Thankfully, I've kept several of these “Ice Packs” that came off Western Digital Velociraptors. While it looks like a heavy-duty heatsink, its main purpose is simply to adapt the 2.5″ Velociraptor to a 3.5″ hot-swap size.

Couple of screws and the Agility nicely sits in the Ice Pack, ready to be installed into the server.

The disk drive is now installed in the server. Hopefully it won't have to leave its home for a very long time.

Time to install pfSense. This simply involves downloading the ISO image from the pfSense web site, uncompressing it, and burning it to a CD. The uncompressed image is only 115MB so the burn process is very quick.

After that, simply connect the USB DVD drive, pop the pfSense CD inside and boot up the server.

A couple of minutes and a few wizard-like questions later, pfSense is ready to be configured via a browser.

Last step is to label the network interfaces on the back of the server.

The server/firewall is now ready to be racked up. Should be going into the datacenter on Wednesday.

Dell PowerConnect 5324 or The Switch Switcheroo

One of my racks at a datacenter has been having some connectivity issues. There's a possibility that the Netgear GS716T uplink switch is starting to flake out, so I wanted to replace the switch while it's still running. Canada Post dropped off this little gem from eBay today: a Dell PowerConnect 5324, a 24-port gigabit managed switch. I'm already running several of these switches with great success.

The idea is to take my existing PowerConnect 5224 from home and replace the datacenter GS716T with it, then use the new 5324 at home. The PowerConnect 5224 has proven very reliable and very low latency. If the GS716T is problematic, I know the Dell will show it.

The first step is to get this switch configured. This means resetting it to factory settings.

Reset To Factory Settings

> enable
> config
> delete startup-config
> reload

Configure IP Address

> enable
> config
> interface vlan 1
> ip address 192.168.77.250 /24
> ip default-gateway 192.168.77.1
> exit
> exit
> copy running-config startup-config

Reload to make sure it saved properly

> reload

Create a web UI user

> enable
> config
> username admin password admin level 15
> copy running-config startup-config

A quick check in the browser and it looks like we're fully operational.

It's always a good idea to update the firmware before the switch goes into production. So the next step is to hit the Dell Support site, punch in the Service Tag number and grab the latest firmware.

To load the firmware onto the switch, I need a TFTP server running. Over the years I've been using this software: no install required, just execute it and drop files into its root. I simply renamed the Dell firmware files, the boot file to boot.rfb and the image file to image.ros, and copied them right into the TFTP root.

Then from the console

> copy tftp://192.168.77.6/image.ros image
> copy tftp://192.168.77.6/boot.rfb boot
The copy operation has failed
Copy:
Due to boot initial state, update should be done twice.
Please download the same file again to complete the process.
> copy tftp://192.168.77.6/boot.rfb boot
The copy operation was completed successfully

Now we need to change the boot flash block. Uploading files to the switch always uploads to the non-active block.

> show bootvar
Images currently available on the FLASH
image-1 not active
image-2 active (selected for next boot)

In this case the switch was already set to boot from image-2, and the new firmware landed in the non-active block, so we now need to boot from image-1.

> boot system image-1
> reload
> enable
> show version
SW version 2.0.1.4 ( date 01-Aug-2010 time 17:00:12 )
Boot version 1.0.2.02 ( date 23-Jul-2006 time 16:45:47 )
HW version 00.00.02

Now all that's left is to configure the VLANs and trunking to mimic the existing configuration and do a straight swap.
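A rough sketch of what that looks like on the 5324 CLI, with a hypothetical VLAN ID and port since the real map comes from the GS716T:

> config
> vlan database
> vlan 10
> exit
> interface ethernet g1
> switchport mode trunk
> switchport trunk allowed vlan add 10
> exit
> exit
> copy running-config startup-config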

Incidentally, I logged onto my GS716T at home while mapping ports and was greeted with this:

I guess I haven’t fiddled with that switch in a while, if it ain’t broke….

Busy Weekend

Had quite a productive weekend. I had to put UltraDMM development on hold for a while to deal with some home network/server issues and catch up on some development for clients.

Picked up an HP (yeah, I know) DL160 G6 server from Kijiji. This server has replaced a Dell PowerEdge 2950 II as my NexentaStor server. With 40GB of RAM and a faster CPU, it should make the use of deduplication a lot more enjoyable; with the limited 16GB of RAM in the PowerEdge 2950, large file deletions would hang the box for quite a while. The DL160 is connected to the Dell PowerVault MD1000 via a Dell H200 6Gbps HBA, using an old OCZ Vertex as an SSD read cache. This makes for a very speedy combination. The datastores are presented to the vSphere server using a combination of iSCSI and NFS pools.
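For reference, that Vertex read cache is just an L2ARC device on the pool. NexentaStor handles it from the UI, but from the raw shell it amounts to something like this (pool and device names are hypothetical):

# zpool add tank cache c2t1d0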

Also picked up this “older” Intel SR1530 server. These make absolutely great pfSense firewall boxes. I've tried many firewalls over the years, from simple software-based routers to dedicated hardware firewalls (insert SonicWall Pro 2040 rant here). Once I found pfSense and all the features it offers, I pretty much stopped looking at anything else. Not only does the firewall come with pretty much every possible feature, it's extensible via plugins. And with this server, I should be able to easily attain 50MB/s throughput between zones, which is perfect for a small virtual environment.

The server comes with two on-board Intel NICs. I've added an additional dual-port Intel PRO/1000 NIC. This gives me 4 full-speed zones (WAN, LAN, DMZ and Bridge). The only thing I still need is a small SSD to use as the boot drive.

Also picked up this Dell PowerEdge R610. To be honest, I haven't figured out a good use for it yet, but it was a great deal worth snagging. For now I'll probably install Windows Storage Server 2012 and attach some eSATA boxes to it for testing.

In the last few days I picked up enough servers to retire my last batch of Dell PowerEdge 2950s. These will be going on Kijiji for sale; hopefully someone will give them a good home.

Now back to UltraDMM development.

SR1680MV – Continued

The SR1680MVs will be receiving some eBay low-profile NICs that arrived today: Intel PRO/1000 dual-port PCIe x4 cards.

I originally mounted them in the second server's backplane. This SR1680MV was originally supposed to run ProxMox, but it occurred to me that it would be a better solution to have one node from each server run ProxMox and the other node vSphere, because these servers do not have redundant power supplies. That way, if one server's power supply fails, my entire infrastructure won't come crashing down; at worst, one node from each cluster will be affected, not the entire cluster.

I've added the new power LEDs to the second set of nodes too. All racked up and ready for testing. I'm hoping to have these servers racked up in the datacenter by next weekend.


Intel SR1680MV (Part 2)

Continued working on the server today. Installed a couple of Intel PRO/1000 VT quad-port network cards: 2 ports for SAN, 1 port for LAN and 1 port for DMZ/WAN traffic for each node. The cards I had did not have low-profile brackets, so I ended up rigging the cards so they wouldn't move. Not the best fix, but since this server will stay home for lab work and testing, it's not really all that important.

Both cards are installed and ready to be plugged back into the server. The server does have rear RJ45 jacks, but they do not seem to be used for typical network purposes, as they do not light up when hooked up to a switch. From what I've read, these servers required a Liquid Computing switch to operate.

vSphere had no trouble recognizing the Intel NICs.


Server racked up. The other SR1680MV server will be operated on once my low profile network cards arrive.


Network cables hooked up…


…and patched into the switches. The Dell 5324 takes care of SAN traffic; the Dell 5224 takes care of LAN/DMZ traffic. The networks are segmented on different VLANs too.


Noticed that a few minutes after bootup the LEDs on the front of the servers go off. I assume that, due to the custom nature of these servers, the LEDs have some other meaning past bootup. It shouldn't be too hard to add a power LED indicator, as I saw that the nodes have an internal Molex header I can draw 5V from to power the LED. I'll make this my next project.
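Quick back-of-the-envelope, assuming a typical indicator LED with a ~2V forward drop at 20mA: (5V - 2V) / 0.02A = 150Ω, so something in the 150 to 220 ohm range as a series resistor off the 5V Molex line should do it.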

Adding the server to the cluster was a snap. Will create some test VMs to stress these nodes for a few days.

100% Super Happy Network Failure Occurrence

The D-Link DGS-3024 finally bit it. It hadn't occurred to me that there could be a problem, even though my network throughput had absolutely plummeted: I used to be able to move files across the network at an easy 100MB/s, but the last time I copied an ISO it was moving at only 11MB/s. Of course, it figures that the switch would go down while I'm on-site at a client. Took the whole network down, including the site-to-site VPN tunnels between my two DCs. No email, no source control, no nothing.

Oh well, I bought that switch on Kijiji about 3-4 years ago for about $80, so I can't really complain. I wonder now if I can bring it back to life.

It’s a good thing I held on to an extra Dell PowerConnect 5224 I bought on eBay about 8 months ago when I was setting up a new half-rack at one of my DCs.

What a pain in the ass it was to replace it though. With all the patch cables between the two switches and the patch panels, I had to disconnect everything just to pull the dead switch out of the rack. Then of course, I figured if I'm going to go that far, I might as well reorganize the whole network. Pulled everything out: switches, patch cables, unplugged all the servers. Took about 4 hours to rewire everything just enough to get me back up and running, including all the necessary connections and configuring the switches for the VLANs and trunking.

Once I get my new servers racked up I’ll wire the rest.

Just noticed how filthy the server case below the switch is. I guess I've got some house cleaning to do.