
VM cluster using HA StarWind iSCSI storage and you don’t have NASA-style data centre redundancy? Don’t bother yet. Major problems.

I’ll leave this post in place, but I should mention that the concerns I raise below no longer apply as of version 5.7. See my follow-up post: StarWind 5.7/5.6 and recovery from both nodes down.


It seems StarWind considers HA to be rather more exclusive than their site and marketing blurb let on.

I understand that true HA means never, ever off, but even investment banks have occasional power-downs, just to prove they can start systems up again afterwards. Beware, though: if you ever (and I mean EVER) want to contemplate turning your clustered storage off for a period of time, whether for a building power cut, an act of god, or anything else, then for now, pick another solution.

It works great if one node is up full time, which I suppose is possible if you are NASA, but it is good practice for any organization to do an occasional power-off, and every so often, even in London, you get a long power outage or building maintenance.

Essentially, the issue is this: if you power down both nodes of a storage cluster gracefully after powering down your Hyper-V/Xen/VMware cluster, you will not be able to bring them up again without MANUALLY specifying the most recent copy of the data (a major risk if you get this wrong and are running any database application), then sitting through a FULL synchronisation. In my test environment, 200 GB took almost 12 hours, during which the cluster was inaccessible because the storage was not accepting incoming connections. In production this would mean your supposedly HA environment being offline until the storage had completed a pointless full sync between nodes.
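The figures above imply a painfully low effective sync rate, which is worth working out. A quick back-of-the-envelope calculation (the 200 GB / 12 hour figure is from my test environment; the extrapolated array sizes below are purely illustrative, assuming the rate stays constant):

```python
# Effective sync rate implied by my test: 200 GB resynced in ~12 hours.
test_gb = 200
test_hours = 12

rate_mb_s = (test_gb * 1024) / (test_hours * 3600)  # MB per second
print(f"Effective sync rate: {rate_mb_s:.1f} MB/s")  # ~4.7 MB/s

# Extrapolate the offline window to larger (hypothetical) array sizes:
for size_gb in (500, 1000, 2000):
    hours = (size_gb * 1024) / (rate_mb_s * 3600)
    print(f"{size_gb} GB -> roughly {hours:.0f} hours of full sync")
```

At under 5 MB/s, any production-sized array is looking at days of downtime, not hours.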

I checked the StarWind forum, where they claim this is by design. This is totally ridiculous. There are degrees of HA, and it is not often that a midsize company can afford separate power companies at either end of the building, which seems to be where most people lose out. For example, we planned to have redundant hosts, redundant storage units, and redundant switches, all on redundant UPSs, but we only have one provider supplying electricity. To totally eliminate the viability of this platform by not implementing a last-write flag on the storage is insane.

Essentially this means a great product is ruined for a large number of its users. A real shame. There is a workaround, outlined in this link, but it is risky: it involves judging for yourself which replica is most current, deleting the targets, recreating them, and then recreating ALL iSCSI connections on the cluster. Absolutely crazy. In my test environment this took me almost an hour the first time round.


If anyone else has had their implementation hobbled by this oversight I’d love to hear from you. I’d also be keen to hear when this is addressed in a workable way by Starwind as this does not seem to be a feature they shout about in the marketing department.


7 responses to “VM cluster using HA StarWind iSCSI storage and you don’t have NASA-style data centre redundancy? Don’t bother yet. Major problems.”

  1. Anton Kolomyeytsev March 6, 2011 at 8:27 pm

    God created this world in only seven days. StarWind iSCSI SAN is much less complex than, say, the ATP metabolism reaction in a human brain cell; it’s not as perfect as the human rights regulation model in mainland China, and it’s not as iconic as the Nissan Skyline GT-R R34. I’m not a god! It’s hard to accept, but I’m doing my best at living with this every day of my life. That was the preamble… Now the dry facts.

    Indeed, when we created HA we did not realize people would tend to put both HA nodes down at the same time. It sounded crazy to us, let’s call things by their real names: it turns the whole idea of having HA head over heels. But people do this, and they tell us a pretty solid theory of why they do it. To make a long story short, it happens because this world is not as immaculate as we’d like it to be, and not everybody can afford a really big UPS to keep servers up if the mains power goes AWOL for a while. That’s perfectly fine! We’ve accepted this as “Sad but true” © … and we are working on an improvement to address this issue.

    We’ll add journaling to the StarWind database transaction log, and we’ll be able to designate the node with the most up-to-date data as the primary point of recovery. So immediately after Mr. Power-Where-Have-You-Been-All-Night-Long returns from your local tap room, you’ll have an at least partially working storage cluster (partially because we have to synchronize all storage nodes to ensure your data is coherent) able to serve your clients just fine. And it should happen in fully automatic mode, or semi-automatic if the operator would like to pull the trigger by assigning the recovery-point node himself. We’re flexible! It’s your cluster and your data, anyway. We’re already working on this, and an issue-free version is expected to be released as version 5.7 in a couple of weeks from now. Not later.

    See, all software in general, and software release builds in particular, have internal issues. Let’s take StarWind as an example. StarWind versions before 5.0 had no HA at all, only a synchronous de-multiplexing mirror: data was saved to both locations, but the single point of failure was still with us… Too bad! Version 5.0 fixed this, adding fully symmetric HA in round-robin mode, but 5.0 had a faulty transport causing BSODs. A real stopper! Versions 5.2 to 5.4 fixed this, but we had a split-brain issue in all versions before 5.4. Version 5.4 added a heartbeat cluster, so we don’t have that HA issue any more, but 5.4 had issues with slow synchronization and 10 GbE support. We improved this in 5.5 and set a redesign milestone for version 5.7, and wow, 5.6 added something called de-duplication for the first time in our history. It also has issues reported by customers, huge memory consumption if you care, and we’ll fix them partially in 5.7, also fix the both-nodes-down issue completely, and replace Microsoft MPIO with our custom MPIO, as Microsoft’s does not work well on 10 GbE yet. Version 5.7 will have its own specific problems for sure, but we’ll face them in 5.8 and up. It’s a never-stopping process. We catch up every single release, fixing and improving something every single day. According to Sir Darwin we’re part of the thing called *Evolution*, and we do indeed follow this model entirely.

    I really appreciate your feedback. Negative feedback is like pain: it’s your friend telling you something is wrong with a particular part of your body or some organ inside it. But if you want the brain to react to your message, please make sure the brain has a chance to see the message delivered. If you have problems with me, my software, or my staff, do me a favor and write on our forum. I came across your post on your blog and now I’m frustrated… Do you really want us to fix our broken software, help you and our customers, and make this world a little bit better, or do you want the occasional visitor to occasionally find out you’re pissed off with our software?

    Anton Kolomyeytsev
    CTO, StarWind Software

    P.S. I’ve put a backlink to your blog from our forum so please don’t remove your article at least for some time. Thanks!

    • ccolonbackslash March 6, 2011 at 8:56 pm

      Hi Anton,

      Appreciate your detailed and passionate response, and my apologies for not putting this on your forum. I had posted there (as jimbul) asking when this feature would be implemented, but have not received a response as yet.

      No way I’ll remove this if you’ve linked to it. Great to hear the feature is being added, as its absence is stopping me making full use of the software at present; as I stated, in all other respects it works superbly, as far as I am concerned.

      I am delighted you address customer grumbles so rapidly; we actually purchased two 8 TB enterprise licenses in January. Though I am ashamed to say that despite testing it for a number of weeks I never instigated a full shutdown… (red face), a shameful admission and the root cause of my frustrated post.

      Can we move on from here or is that it?


      MrPowerWhereHaveYouBeenAllNight (drunk).

  2. Anton Kolomyeytsev March 6, 2011 at 8:32 pm

    And I have the impression you did not really run StarWind, because what you say here is not completely true. You cannot have a true HA cluster immediately after power is back, b/c we need to verify the data is the same, but one node is working and serving requests. After the data is verified we’ll bring back multipath (full HA) and turn Write-Back Cache ON.


  3. ccolonbackslash March 6, 2011 at 9:04 pm

    I am running 5.5. If I have it configured incorrectly and it does gracefully recover from a full shutdown, I hold my hands up. I have repeatedly tried this, however, and short of a full sync or recreating nodes I could not see a way to do it. I realize 5.6 is available, but I could not see that facility there when I looked.

  4. Anton Kolomyeytsev March 15, 2011 at 5:28 am

    1) There’s no problem with writing wherever you want! You are just supposed to get help in some fixed place. I guess you’ve paid for it anyway 🙂

    2) StarWind automatically decides whether it should go for a fast or a full sync. Do you want to override this?


    P.S. We can move this offline or to e-mail so as not to confuse people. Or as you wish…

  5. ccolonbackslash March 15, 2011 at 11:27 am

    Hi Anton – Thanks very much for your reply.

    1.) 🙂 We did, and I’m greatly excited about having it in place.

    2.) I would like StarWind to make this decision for me; I believe it is much better qualified to make it. What I’m hoping for is to be able to do the following in the event of a power outage that exceeds our UPSs’ capacity:
    - First, shut down the Hyper-V cluster.
    - Then shut down the StarWind HA array, with StarWind recording which side was “freshest”, or that both were in sync.
    - Once power returns, start up StarWind and have it automatically do whatever sync is appropriate. While this is going on, my storage is still accessible to my cluster, even during a full sync, because it knows which side was last written to/most up to date and can redirect traffic to that side.

    I fully acknowledge some of this may be impossible, unworkable, or dependent on other infrastructure or services that I’ve not considered.
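    The recovery behaviour in the list above could be sketched roughly as follows. This is purely illustrative Python, not StarWind’s actual API; the node names, the persisted last-write timestamp, and the in-sync flag are all my own hypothetical constructs:

    ```python
    from dataclasses import dataclass

    @dataclass
    class Node:
        name: str
        last_write: float  # timestamp persisted at graceful shutdown (hypothetical)
        in_sync: bool      # True if both nodes were in sync when powered off

    def pick_recovery_node(a: Node, b: Node) -> Node:
        """On power-up, choose the node with the freshest data and serve
        I/O from it while the stale node resynchronises in the background."""
        if a.in_sync and b.in_sync:
            return a  # either copy is valid; pick one arbitrarily
        return a if a.last_write >= b.last_write else b

    primary = pick_recovery_node(Node("SAN1", 1000.0, False),
                                 Node("SAN2", 1042.0, False))
    print(primary.name)  # SAN2, since its last write is the most recent
    ```

    The point is simply that a persisted last-write flag would let the array answer “which side is freshest?” automatically, instead of making the operator guess.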

    I am happy to move to email if that’s better for you, or please direct me to RTFM if I’ve not read some document I should have.

    Thanks again,

