Making use of a SAN (storage area network) provides some incredible benefits. I won’t go into depth, but at a high level you often get:
1. Excellent hardware redundancy for data storage, more so if you are using multiple arrays, but even most enterprise single arrays can provide N+1 redundancy. Now we can tolerate power failures, drive failures, switch failures, and so on.
2. Extra options for historical data integrity, backup, and DR: most enterprise SANs support volume snapshots and rollbacks. Some even support advanced features specific to protecting MS-SQL and, I am sure, other database products. Our implementation also provides some great options for DR, like being able to replicate data/volumes from a production SAN over to a different SAN in a different network/datacenter.
3. Administrative ease… managing storage volumes for all of your systems from one interface makes life much easier.
4. Online disk resizing: did your database run out of disk space while the SAN hosting the volume still has plenty available? No problem, just increase the size of the volume on the SAN (often something you can do while the volume is online and in use) and then extend the partition in Windows to take up the new volume space (also an online operation).
For these reasons (and I am sure many more), SANs have become a staple in a lot of enterprise networks. But let me talk about some pain points, particularly in older SAN implementations and particularly around iSCSI and older networks.
Apparently a handful of customers using Cloudflare for DNS, specifically CNAME records, experienced a brief outage of name resolution services on New Year’s. I found the reason rather interesting: devs at Cloudflare assumed time can’t move backwards. An understandable assumption, but a faulty one because of leap seconds. Anyhow, if you do programming you might find the root cause analysis for this hiccup interesting and informative:
Well worth a quick read. No, unfortunately it didn’t have anything to do with Dark Matter and/or what would happen were a black hole and a DeLorean traveling at 88 mph suddenly to meet while Superman flies around the planet at light speed. But it is curious enough all the same. Happy New Year… Sanitize your outputs…
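For anyone who wants to see the "time can’t move backwards" trap in miniature, here is a rough Python sketch. It is not Cloudflare’s code, just an illustration with made-up clock readings: subtracting two wall-clock readings can go negative if the clock is stepped back in between, so you either clamp the result or measure durations with a monotonic clock. The function names are my own.

```python
import time


def elapsed_wall(start, now):
    """Elapsed seconds between two wall-clock readings, never negative."""
    # Wall-clock time can be stepped backwards (leap second, NTP correction),
    # so the raw difference may be negative; clamp it to zero.
    return max(0.0, now - start)


def timed(fn):
    """Run fn() and time it with the monotonic clock, which never goes backwards."""
    start = time.monotonic()
    result = fn()
    return result, time.monotonic() - start


# Hypothetical readings where the clock stepped back between the two samples.
start, now = 100.9, 100.1
print(now - start)               # raw difference is negative
print(elapsed_wall(start, now))  # 0.0
```

The clamp is the defensive fix when you are stuck with wall-clock timestamps; `time.monotonic()` is the better choice when all you need is a duration.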
Without going into great technical detail (which on this topic I couldn’t do anyway), it seems after much reading that it is a recommended practice to spread your SQL Server TempDB across multiple files based on how many cores (or perhaps threads) your processor has.
To keep things simple, let’s say I have a 4 core CPU and no hyper-threading (I am not sure if the rule applies to physical cores or to threads). This means I want to split my TempDB up into four different files. However, there is one caveat: you should only do this if you actually have four separate physical drives. Not separate files on one drive, not even separate files on separate partitions… based on what I read, this is only beneficial if you actually have separate physical drives.
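To make the 4 core example concrete, here is a small Python sketch that only builds the T-SQL text for the extra TempDB files, one per drive. The drive paths, file names and size are placeholders I made up, and it assumes the default `tempdev` file counts as file number one, so a 4 core box needs three more. It does not connect to SQL Server; you would review the output and run it yourself.

```python
def tempdb_add_file_statements(data_dirs, size_mb=1024):
    """Build one ALTER DATABASE ... ADD FILE statement per directory.

    data_dirs: hypothetical directories, ideally one per physical drive.
    size_mb:   same size for every file (an assumed value, adjust to taste).
    """
    statements = []
    # Start numbering at 2 because the default tempdev file already exists.
    for i, directory in enumerate(data_dirs, start=2):
        name = f"tempdev{i}"
        path = directory.rstrip("\\") + "\\" + name + ".ndf"
        statements.append(
            f"ALTER DATABASE tempdb ADD FILE "
            f"(NAME = {name}, FILENAME = '{path}', SIZE = {size_mb}MB);"
        )
    return statements


# Three extra files for a 4 core machine (placeholder drive letters).
for stmt in tempdb_add_file_statements(["E:\\TempDB", "F:\\TempDB", "G:\\TempDB"]):
    print(stmt)
```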
I have been working on email stuff on and off for the last… forever.
One of the handiest and easiest tools to have in one’s pocket for testing email functionality with a particular SMTP server is the ability to quickly send email through a selected server from the command line. Just as using telnet to test ports makes day-to-day IT life easier than having to grab and install some extra tool, so too does the ability to shoot emails off from a CLI.
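As one way to do this, here is a minimal sketch using nothing but Python’s standard library (`smtplib` and `email.message`). It assumes a server that accepts plain, unauthenticated SMTP on port 25, which many internal relays do but plenty of servers don’t; the function names and the message text are my own.

```python
import smtplib
import sys
from email.message import EmailMessage


def build_test_message(sender, recipient, subject="SMTP test"):
    """Build a small plain-text test message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content("Test message sent from the command line.")
    return msg


def send_test_message(host, msg, port=25, timeout=10):
    """Send msg through the chosen server (no auth or TLS in this sketch)."""
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        smtp.send_message(msg)


# Only sends when called with exactly: <smtp-host> <from> <to>
if __name__ == "__main__" and len(sys.argv) == 4:
    host, sender, recipient = sys.argv[1:4]
    send_test_message(host, build_test_message(sender, recipient))
    print(f"Sent test message via {host}")
```

Saved as, say, `sendtest.py` (a made-up name), it would be run as `python sendtest.py mail.example.com me@example.com you@example.com`. If the server rejects the message, `smtplib` raises an exception with the server’s response, which is usually exactly the detail you are after when testing.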
I have been fiddling about with setting up a SQL Server 2012 failover cluster using an EqualLogic SAN. After a whole lot of digging about I found two different posts on two different sites which got me about 90% of the way there. However, there were some key “gotchas” and other information missing in both cases, and I wanted to document those here in addition to referencing the articles I followed for my setup.
BTW – Just my 2-cents, but setting up clustering is complicated… especially when you throw SQL in the mix. It isn’t bad once you have done it a few times (I tested again, and again, and again in a virtual environment) but there are honestly like 50+ considerations to take into account to ensure everything goes correctly.
I am assuming that if you are here you already have a general understanding of failover clustering and know what you want to do and why. This article also doesn’t really cover all aspects of high availability. I don’t discuss how your SAN(s) should be networked, for example, though I do touch on a few items that fall in this area. This isn’t meant to be comprehensive and a lot of it is just for personal reference.
So here are some tips if this is your first go around. These are in NO particular order or grouping (this is very “stream of thought”) so I would suggest reading this from start to finish at least once rather than referencing it as you are going through your setup.