Large scale databases with PHP/Mysql

by @ 8:38 am on December 20, 2007.  Filed under Mysql, PHP

I’m currently faced with a paradox. I have a huge database (one table is more than 84 million records) that requires daily updates of about 2 m records and probably daily inserts of about 500k or more. Currently, I am trying to achieve this with Mysql however I am finding that the server I’m hosting the application is far from ideal. So my dilemma is, do I spend my hard earned money on a beefy server (>$400 a month) or do I build one and keep it here at my house on my Verizon Fios fiber connection. If I do the first option, it eats into my hard earned cash. If I do the second option, I have to figure out how to connect my home box to my live website in a reliable manner.

The downside to the fios connection is, I do not get a static IP. Also, I have a suspicion that they are blocking incoming ports.

I’ve decided to take the latter route, build a machine for home that will host the database on some RAID1 drives. I have a few SCSI 320 drives as well, so a RAID 10 setup would be optimal. But my first attempt is going to be on an SATA Raid card made by Silicon Image. The complexities of this application are too numerous to go into, but the general goals are to parse large text files, analyze them and store the results in the database. The issue lies in the difficulty of applying these massive updates to Mysql in a way that won’t take half a day to process. It’s a two part equation. The server that is doing the data analysis has to be respectable in terms of hardware/cpu/memory, however, the live web server which also hosts a “copy” of the database for the website to actually use. Uploading the daily updates to the live server will be the bottleneck in terms of system performance. Tests will tell if my RAID on SATA will be of any benefit. If not, I will switch to my SCSI RAID and work from there. I’ve been searching Google for answers on the topic of large scale data warehouses and data mining but I am coming up empty handed. Got any ideas?

web·pit n. a place of discovery; a repository of information; where coders come for enlightenment; a programmers diary.

internal links:

categories:

search:

archives:

December 2007
M T W T F S S
    Jan »
 12
3456789
10111213141516
17181920212223
24252627282930
31  

other:

All content Copyright (c) 2008, Mark Williamson