WebHostingTalk, one of the largest online forums for discussing web hosting and server-related issues, was maliciously attacked over the weekend.
A hacker gained access to an offsite backup server and then used information on that server to walk into the main live server. The hacker deleted the backup databases, and then deleted the live site. Apparently, they also covered their tracks and overwrote the drives so that recovery was impossible.
In a forum post, a WHT community member revealed the following:
This attack was very deliberate, sophisticated and calculated. The attacker was able to circumvent our security measures and access via an arcane backdoor protected by additional firewall. We are still investigating the situation, but we know the attacker infiltrated and deleted the backups first and then deleted three databases: user/post/thread. We have no record or evidence that private message data was accessed. Absolutely no credit card or PayPal data was exposed.
Unfortunately for WebHostingTalk, the last local offline copy of the system is from late last year, so expect them to be offline for a while as they rebuild their database.
It just goes to show how important offline backups are: the attacker could reach and delete every copy that was online. Make sure your backup solution includes a copy that cannot be reached from the live system.
Posted in Internet, Security | No Comments »
Recently posted on the Facebook blog:
Almost two million new users from around the world sign up for Facebook each week—and we couldn’t be happier. It’s tremendously rewarding to see so many people find what we work on useful and fun. As we continue to add new users and features, however, the load on our thousands of servers continues to increase at a pretty astounding rate. A few weeks ago we reached full capacity in our California datacenters. In the past we handled this problem by purchasing a few dozen servers, hooking them up, and getting on with our lives, but this time we didn’t have it so easy. We’d actually run out of space in our datacenters for new machines.
Fortunately we saw this problem coming a long time ago and started work on a new datacenter in Virginia. Now, we identify whether a user would be better off talking to the east coast datacenter or a west coast datacenter. For people in Europe and the eastern half of the US, it’s noticeably faster to talk to a server in Virginia than in California. We direct these users to Virginia whenever they’re browsing the site and not making any changes.
Whenever that person goes to change some data—uploading a photo album, or changing profile info for example—we send them off to California so that all our modifying operations happen in the same location. This decision was made to prevent two or more modifications from conflicting with each other and messing up our data. It might sound like we’re forcing our users to go to California a lot but only about 10% of our traffic causes a modifying operation. MySQL has a great replication feature that allows us to, in real time, stream all the modifications happening on a California MySQL server to another one in Virginia. Replication happens so fast, even across the country, that the Virginia servers are almost never more than one or two seconds behind the California servers.
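The routing rule described in the two paragraphs above can be sketched roughly as follows. This is only an illustration, not Facebook's actual code: the region labels, the `Request` type, and the `choose_datacenter` function are all hypothetical names made up for this example.

```python
from dataclasses import dataclass

# Hypothetical labels; the post names only the two locations.
PRIMARY = "california"   # all modifying operations go here
REPLICA = "virginia"     # read-only traffic for nearby users

# Assumed region tags for "Europe and the eastern half of the US".
NEAR_VIRGINIA = {"europe", "us-east"}


@dataclass
class Request:
    region: str     # where the user is connecting from
    modifies: bool  # True for uploads, profile edits, and other writes


def choose_datacenter(req: Request) -> str:
    """Send every write to the primary; send reads to the closer site."""
    if req.modifies:
        return PRIMARY
    if req.region in NEAR_VIRGINIA:
        return REPLICA
    return PRIMARY


if __name__ == "__main__":
    print(choose_datacenter(Request("europe", modifies=False)))
    print(choose_datacenter(Request("europe", modifies=True)))
```

Keeping every write in one location is what avoids conflicting modifications; the price is that the read-only site can lag slightly behind, which is the replication delay the post mentions.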
Even though all of the modification happens in California and streams instantly to Virginia, we were faced with another problem. Although Facebook’s data is stored in MySQL database servers, we use a large number of memcached servers to store copies of the data. Memcached is much faster and able to keep up with requests quicker than the databases themselves can keep up. We had to figure out a way for memcached servers to replicate data concurrently with the MySQL databases. Because of various technical limitations of our architecture there was no easy way to do so.
Fortunately MySQL is open source software, meaning we can actually change the way it works by modifying the code. We did just that, embedding extra information into the MySQL replication stream that allows us to properly update memcached in Virginia. This ensures that the cache and the database are always in sync. Over the last seven months a great team of Facebook employees has been building new software and setting up new servers like I described above. Over Thanksgiving we finally flipped the switch and since then almost 30% of our traffic has been served from Virginia.
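To make the idea concrete, here is a toy model of a replica that keeps its cache consistent by reading extra information attached to each replicated change. It is a sketch under assumptions: the post does not describe the real format of the replication stream, and this example assumes that "updating" the cache means deleting the stale keys so the next read reloads them. `ReplicationEvent`, `Replica`, and `cache_keys` are hypothetical names, and plain dictionaries stand in for MySQL and memcached.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicationEvent:
    table: str
    row_id: int
    new_row: dict
    # The "extra information": cache keys made stale by this change (assumed).
    cache_keys: list = field(default_factory=list)


class Replica:
    def __init__(self):
        self.db = {}     # stands in for the replica database
        self.cache = {}  # stands in for the local cache servers

    def read(self, table, row_id, cache_key):
        # Serve from the cache, filling it from the database on a miss.
        if cache_key not in self.cache:
            self.cache[cache_key] = self.db.get((table, row_id))
        return self.cache[cache_key]

    def apply(self, event):
        # Apply the replicated write, then drop the cache entries it names.
        self.db[(event.table, event.row_id)] = event.new_row
        for key in event.cache_keys:
            self.cache.pop(key, None)


if __name__ == "__main__":
    replica = Replica()
    replica.apply(ReplicationEvent("user", 1, {"name": "old"}, ["user:1"]))
    print(replica.read("user", 1, "user:1"))
    replica.apply(ReplicationEvent("user", 1, {"name": "new"}, ["user:1"]))
    print(replica.read("user", 1, "user:1"))
```

Without the keys in the event, the second read would keep returning the cached old row, which is the stale-cache problem the post describes.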
The east coast datacenter is a great first step towards keeping Facebook fast and reliable as the site grows. Going forward we have lots of exciting plans to expand our infrastructure and improve performance so no user ever has to sit around waiting for a page to load.
Posted in Facebook, Internet | No Comments »
Companies can now go ahead and fire their expensive database administrators, the engineers who keep the Oracle or IBM databases humming. Amazon has just added an enterprise-class database called SimpleDB to its suite of cloud-based IT infrastructure, which also includes storage (S3) and computation (EC2) available by the drink. Today, Amazon is taking sign-ups for the SimpleDB beta, which should start in a few weeks. As it points out on the new SimpleDB page:
Amazon SimpleDB is a web service for running queries on structured data in real time. This service works in close conjunction with Amazon Simple Storage Service (Amazon S3) and Amazon Elastic Compute Cloud (Amazon EC2), collectively providing the ability to store, process and query data sets in the cloud. These services are designed to make web-scale computing easier and more cost-effective for developers.
Traditionally, this type of functionality has been accomplished with a clustered relational database that requires a sizable upfront investment, brings more complexity than is typically needed, and often requires a DBA to maintain and administer. In contrast, Amazon SimpleDB is easy to use and provides the core functionality of a database – real-time lookup and simple querying of structured data – without the operational complexity. Amazon SimpleDB requires no schema, automatically indexes your data and provides a simple API for storage and access. This eliminates the administrative burden of data modeling, index maintenance, and performance tuning. Developers gain access to this functionality within Amazon’s proven computing environment, are able to scale instantly, and pay only for what they use.
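To illustrate what "requires no schema" and "automatically indexes your data" can mean in practice, here is a small in-memory toy. It is not the SimpleDB API, which the announcement does not show; `ToyStore`, `put`, and `query` are hypothetical names invented for this sketch.

```python
from collections import defaultdict


class ToyStore:
    """Schema-free items, with every attribute indexed on write."""

    def __init__(self):
        self.items = {}                # item name -> {attribute: value}
        self.index = defaultdict(set)  # (attribute, value) -> item names

    def put(self, name, **attributes):
        # Any item may carry any attributes; nothing is declared up front.
        old = self.items.get(name, {})
        for attr, value in attributes.items():
            if attr in old:
                # Remove the index entry for the value being replaced.
                self.index[(attr, old[attr])].discard(name)
            self.index[(attr, value)].add(name)
        self.items[name] = {**old, **attributes}

    def query(self, attr, value):
        # Simple equality lookup answered from the index, not a scan.
        return sorted(self.index.get((attr, value), set()))


if __name__ == "__main__":
    store = ToyStore()
    store.put("item1", color="red", size="L")
    store.put("item2", color="red")
    store.put("item3", price="10")
    print(store.query("color", "red"))
```

The trade-off in a design like this is that writes do extra work to maintain the index, and only simple lookups are supported, which matches the "simple querying" the announcement promises rather than full relational queries.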
This will be especially attractive for Web startups. Amazon has just taken another major infrastructure cost off the table for them. Relational databases are expensive to buy and maintain. Whatever features or performance SimpleDB lacks, it should make up for in price. Amazon wants to democratize the database by making it available to more businesses, and even individuals, thus leveling the playing field between big companies and startups even more.
And since SimpleDB operates at Web scale, larger companies will wake up to the cost-saving opportunities of such a service as well. IBM, for one, is already trying to preempt any customer defections with its copycat Blue Cloud initiative. If speed is of the essence, you might still want to keep your database on your own servers. But the Web is where most software will one day live, whether consumer or enterprise. And Amazon’s got nothing to lose by speeding that day along.
Pricing for SimpleDB is as follows:
Machine Utilization – $0.14 per Amazon SimpleDB Machine Hour consumed
Data Transfer
    $0.10 per GB – all data transfer in
    $0.18 per GB – first 10 TB / month data transfer out
    $0.16 per GB – next 40 TB / month data transfer out
    $0.13 per GB – data transfer out / month over 50 TB
    Data transfer “in” and “out” refers to transfer into and out of Amazon SimpleDB. Data transferred between Amazon SimpleDB and other Amazon Web Services is free of charge (i.e., $0.00 per GB).
Structured Data Storage – $1.50 per GB-month
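As a rough guide to how these rates combine, the sketch below estimates a monthly bill from the prices listed above. The function names are made up for this example, and it assumes 1 TB = 1024 GB, which the price list does not specify.

```python
GB_PER_TB = 1024  # assumption; the price list does not define a TB

MACHINE_HOUR_RATE = 0.14  # $ per machine hour
TRANSFER_IN_RATE = 0.10   # $ per GB in
STORAGE_RATE = 1.50       # $ per GB-month

# (tier size in GB, $ per GB) for data transfer out, in billing order.
OUT_TIERS = [
    (10 * GB_PER_TB, 0.18),
    (40 * GB_PER_TB, 0.16),
    (float("inf"), 0.13),
]


def transfer_out_cost(gb):
    """Charge each tier's rate only for the GB that fall inside that tier."""
    cost = 0.0
    remaining = gb
    for size, rate in OUT_TIERS:
        used = min(remaining, size)
        cost += used * rate
        remaining -= used
        if remaining <= 0:
            break
    return cost


def monthly_cost(machine_hours, gb_in, gb_out, gb_stored):
    return (
        machine_hours * MACHINE_HOUR_RATE
        + gb_in * TRANSFER_IN_RATE
        + transfer_out_cost(gb_out)
        + gb_stored * STORAGE_RATE
    )


if __name__ == "__main__":
    print(f"${monthly_cost(10, 5, 20, 2):.2f}")
```

For example, 10 machine hours, 5 GB in, 20 GB out, and 2 GB stored comes to $1.40 + $0.50 + $3.60 + $3.00 = $8.50 for the month.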
Posted in Internet, Programming, Software | No Comments »