Beyond php - it's not (just) about the code

description

Most PHP developers focus on writing code. But creating Web applications is about much more than just writing PHP. Take a step outside the PHP cocoon and into the big PHP ecosphere to find out how small code changes can make a world of difference on servers and network. This talk is an eye-opener for developers who spend over 80% of their time coding, debugging and testing.

Transcript of Beyond php - it's not (just) about the code

  • 1. Wim Godden Cu.be Solutions @wimgtr Beyond PHP : It's not (just) about the code

  • 2. Who am I ? Wim Godden (@wimgtr)
  • 3. Where I'm from
  • 4. Where I'm from
  • 5. Where I'm from
  • 6. Where I'm from
  • 7. Where I'm from
  • 8. Where I'm from
  • 9. My town
  • 10. My town
  • 11. My town
  • 12. Belgium : the traffic
  • 13. Who am I ? Wim Godden (@wimgtr)
      - Founder of Cu.be Solutions (http://cu.be)
      - Open Source developer since 1997
      - Developer of OpenX, PHPCompatibility, PHPConsistent, Nginx SLIC, ...
      - Speaker at PHP and Open Source conferences
  • 14. Cu.be Solutions ?
      - Open source consultancy
      - PHP-centered
      - Training courses
      - High-speed redundant network (BGP, OSPF, VRRP)
      - High scalability development
      - Nginx + extensions
      - MySQL Cluster
      - Projects : mostly IT & Telecom companies, lots of public-facing apps/sites
  • 15. Who are you ?
      - Developers ?
      - Anyone set up a MySQL master-slave ?
      - Anyone set up a site/app on separate web and database servers ? How much traffic between them ?
  • 16. The topic
      - Things we take for granted
      - Famous last words : "It should work just fine"
      - Works fine today, might fail tomorrow
      - Most common mistakes
      - PHP code
      - PHP ecosystem
  • 17. It starts with... code ! First up : database
  • 18.
    Database queries complexity
        SELECT DISTINCT n.nid, n.uid, n.title, n.type,
            e.event_start, e.event_start AS event_start_orig,
            e.event_end, e.event_end AS event_end_orig,
            e.timezone, e.has_time, e.has_end_date,
            tz.offset AS offset, tz.offset_dst AS offset_dst, tz.dst_region, tz.is_dst,
            e.event_start - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND AS event_start_utc,
            e.event_end - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND AS event_end_utc,
            e.event_start - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_start_user,
            e.event_end - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_end_user,
            e.event_start - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_start_site,
            e.event_end - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_end_site,
            tz.name as timezone_name
        FROM node n
            INNER JOIN event e ON n.nid = e.nid
            INNER JOIN event_timezones tz ON tz.timezone = e.timezone
            INNER JOIN node_access na ON na.nid = n.nid
            LEFT JOIN domain_access da ON n.nid = da.nid
            LEFT JOIN node i18n ON n.tnid > 0 AND n.tnid = i18n.tnid AND i18n.language = 'en'
        WHERE (na.grant_view >= 1 AND ((na.gid = 0 AND na.realm = 'all')))
            AND ((da.realm = "domain_id" AND da.gid = 4) OR (da.realm = "domain_site" AND da.gid = 0))
            AND (n.language ='en' OR n.language ='' OR n.language IS NULL OR n.language = 'is' AND i18n.nid IS NULL)
            AND ( n.status = 1 AND ((e.event_start >= '2010-01-31 00:00:00' AND e.event_start = '2010-01-31 00:00:00' AND e.event_end = '2010-02-01 00:00:00' AND event_start = '2010-02-01 00:00:00' AND event_end
        [...]
        filterByState('MN')
            ->find();
        foreach ($customers as $customer) {
            $contacts = ContactsQuery::create()
                ->filterByCustomerid($customer->getId())
                ->find();
            foreach ($contacts as $contact) {
                doSomestuffWith($contact);
            }
        }
  • 28.
    Joins
        $contacts = mysql_query("
            select contact.*
            from customer
            join contact on contact.customerid = customer.id
            where state = 'MN'
        ");
        while ($contact = mysql_fetch_array($contacts)) {
            doSomeStuffWith($contact);
        }
      - or the ORM equivalent
  • 29. Better...
      - 10001 → 1 query
      - Sadly : people still produce code with query loops
      - Usually : Growth not anticipated
      - Internal app → Public app
  • 30. The origins of this talk
      - Customers :
      - Projects we built
      - Projects we didn't build, but got pulled into : Fixes, Changes, Infrastructure migration
      - 15 years of 'how to cause mayhem with a few lines of code'
  • 31. Client X
      - Jobs search site
      - Monitor job views : Daily hits, Weekly hits, Monthly hits, Which user saw which job
  • 32. Client X
      - Originally : when user viewed job details
      - Now : when job is in search result
      - Search for 'php' → 50 jobs = 50 jobs to be updated
      - 50 updates for shown_today
      - 50 updates for shown_week
      - 50 updates for shown_month
      - 50 inserts for shown_user
      - = 200 queries for 1 search !
  • 33. Client X : the code
        foreach ($jobs as $job) {
            $db->query("
                insert into shown_today( jobId, number ) values( " . $job['id'] . ", 1 )
                on duplicate key update number = number + 1
            ");
            $db->query("
                insert into shown_week( jobId, number ) values( " . $job['id'] . ", 1 )
                on duplicate key update number = number + 1
            ");
            $db->query("
                insert into shown_month( jobId, number ) values( " . $job['id'] . ", 1 )
                on duplicate key update number = number + 1
            ");
            // `when` is a reserved word in MySQL, so it has to be quoted with backticks
            $db->query("
                insert into shown_user( jobId, userId, `when` ) values ( " . $job['id'] . ", " . $user['id'] . ", now() )
            ");
        }
  • 34. Client X : the graph
  • 35. Client X : the numbers
      - 600-1000 inserts/sec (peaks up to 1600)
      - 400-1000 updates/sec (peaks up to 2600)
      - 16 core machine
  • 36. Client X : panic !
      - Mail : "MySQL slave is more than 5 minutes behind master"
      - We set it up → who did they blame ?
      - Wait a second !
  • 37. Client X : what's causing those peaks ?
  • 38. Client X : possible cause ?
      - Code changes ?
        According to developers : none
      - Action : turn on general log, analyze with pt-query-digest
      - 50+-fold increase in 4 queries
      - Developers : 'Oops we did make a change'
      - After 3 days : 2.5 days behind
      - Every hour : 50 min extra lag
  • 39. Client X : But why is the slave lagging ?
      - [Diagram : Master (File : master-bin-xxxx.log, Binlog dump thread), Slave (File : master-bin-xxxx.log, Slave I/O thread, Slave SQL thread)]
  • 40. Client X : Master
  • 41. Client X : Slave
  • 42. Client X : fix ?
        foreach ($jobs as $job) {
            $db->query("
                insert into shown_today( jobId, number ) values( " . $job['id'] . ", 1 )
                on duplicate key update number = number + 1
            ");
            $db->query("
                insert into shown_week( jobId, number ) values( " . $job['id'] . ", 1 )
                on duplicate key update number = number + 1
            ");
            $db->query("
                insert into shown_month( jobId, number ) values( " . $job['id'] . ", 1 )
                on duplicate key update number = number + 1
            ");
            $db->query("
                insert into shown_user( jobId, userId, `when` ) values ( " . $job['id'] . ", " . $user['id'] . ", now() )
            ");
        }
  • 43. Client X : the code change
        insert into shown_today values (5, 1), (8, 1), (12, 1), (18, 1), ... on duplicate key ... ;
        insert into shown_week values (5, 1), (8, 1), (12, 1), (18, 1), ... on duplicate key ... ;
        insert into shown_month values (5, 1), (8, 1), (12, 1), (18, 1), ... on duplicate key ... ;
        insert into shown_user values (5, 23, "2013-11-12 12:01:00"), (8, 23, "2013-11-12 12:01:00"), ... ;
  • 44. Client X : the code change
        $todayQuery = "
            insert into shown_today( jobId, number ) values
        ";
        foreach ($jobs as $job) {
            $todayQuery .= "(" . $job['id'] . ", 1),";
        }
        // strip the trailing comma left by the loop
        $todayQuery = substr($todayQuery, 0, strlen($todayQuery) - 1);
        $todayQuery .= "
            on duplicate key update number = number + 1
        ";
        $db->query($todayQuery);
  • 45. Client X : the chosen solution
        $db->autocommit(false);
        foreach ($jobs as $job) {
            $db->query("
                insert into shown_today( jobId, number ) values( " . $job['id'] . ", 1 )
                on duplicate key update number = number + 1
            ");
            $db->query("
                insert into shown_week( jobId, number ) values( " . $job['id'] . ", 1 )
                on duplicate key update number = number + 1
            ");
            $db->query("
                insert into shown_month( jobId, number ) values( " . $job['id'] . ", 1 )
                on duplicate key update number = number + 1
            ");
            $db->query("
                insert into shown_user( jobId, userId, `when` ) values ( " . $job['id'] . ", " . $user['id'] . ", now() )
            ");
        }
        $db->commit();
  • 46. Client X : conclusion
      - For loops are bad (we already knew that)
      - Add master/slave and it gets much worse
      - Use transactions : it will provide huge performance increase
      - Result : slave caught up 5 days later
  • 47. Database → Network
      - Customer Y
      - Top 10 site in Belgium
      - Growing rapidly
      - At peak traffic : Inexplicable latency on database
      - Load on webservers : minimal
      - Load on database servers : acceptable
  • 48. Client Y : the network
  • 49. Client Y : the network (60GB, 700GB, 700GB)
  • 50. Client Y : network overload
      - Cause : Drupal hooks retrieving data that was not needed
      - Only load data you actually need
      - Don't know at the start ? Use lazy loading
      - Caching : Same story
      - Memcached/Redis are fast
      - But : data still needs to cross the network
  • 51. Network trouble : more than just traffic
      - Customer Z
      - 150,000 visits/day
      - News ticker : XML feed from other site (owned by same customer)
      - Cached for 15 min
  • 52. Customer Z : fetching the feed
        if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
            unlink(APP_DIR . '/tmp/cacheFile.xml');
            file_put_contents(
                APP_DIR . '/tmp/cacheFile.xml',
                file_get_contents('http://www.scrambledsitename.be/xml/feed.xml')
            );
        }
        $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
      - What's wrong with this code ?
  • 53. Customer Z : no feed without the source (Feed source)
  • 54. Customer Z : no feed without the source (Feed source)
  • 55. Customer Z : timeout
      - default_socket_timeout : 60 sec by default
      - Each visitor : 60 sec wait time
      - People keep hitting refresh → more load
      - More active connections → more load
      - Apache hits maximum connections → entire site down
  • 56.
    Customer Z : timeout fix
        $context = stream_context_create(
            array( 'http' => array( 'timeout' => 5 ) )
        );
        if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
            unlink(APP_DIR . '/tmp/cacheFile.xml');
            file_put_contents(
                APP_DIR . '/tmp/cacheFile.xml',
                file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context)
            );
        }
        $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
  • 57. Customer Z : don't delete from cache
        $context = stream_context_create(
            array( 'http' => array( 'timeout' => 5 ) )
        );
        if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
            unlink(APP_DIR . '/tmp/cacheFile.xml');
            file_put_contents(
                APP_DIR . '/tmp/cacheFile.xml',
                file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context)
            );
        }
        $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
  • 58. Customer Z : don't delete from cache
        $context = stream_context_create(
            array( 'http' => array( 'timeout' => 5 ) )
        );
        if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
            file_put_contents(
                APP_DIR . '/tmp/cacheFile.xml',
                file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context)
            );
        }
        $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
  • 59. Customer Z : don't delete from cache
        $context = stream_context_create(
            array( 'http' => array( 'timeout' => 5 ) )
        );
        if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
            $feed = file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context);
            if ($feed !== false) {
                file_put_contents(
                    APP_DIR . '/tmp/cacheFile.xml',
                    $feed
                );
            }
        }
        $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
  • 60. Customer Z : process early
        $context = stream_context_create(
            array( 'http' => array( 'timeout' => 5 ) )
        );
        if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
            $feed = file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context);
            if ($feed !== false) {
                file_put_contents(
                    APP_DIR . '/tmp/cacheFile.xml',
                    ParseXmlFeed($feed)
                );
            }
        }
  • 61.
    Customer Z : file_[get|put]_contents atomicity
        $context = stream_context_create(
            array( 'http' => array( 'timeout' => 5 ) )
        );
        if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
            $feed = file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context);
            if ($feed !== false) {
                file_put_contents(
                    APP_DIR . '/tmp/cacheFile.xml',
                    ParseXmlFeed($feed)
                );
            }
        }
      - Relying on user → concurrent requests possible → corrupt data
      - Better : run every 15min through cronjob
  • 62. Network resources
      - Use timeouts for all : fopen, curl, SOAP
      - Data source trusted ? → set up a webservice → let them push updates when their feed changes → less load on data source → no timeout issues
      - Add logging → early detection
  • 63. Logging
      - Logging = good
      - Logging in PHP using fopen → bad idea : locking issues
      - Use monolog : file, syslog, mail, Pushover, HipChat, Graylog, Rollbar, ElasticSearch (and 50 more)
      - For Firefox : FirePHP (add-on for Firebug)
      - Debug logging = bad on production
      - Watch your logs !
      - Don't log on slow disks → I/O bottlenecks
  • 64. File system : I/O bottlenecks
      - Causes :
      - Excessive writes (database updates, logfiles, swapping, ...)
      - Excessive reads (non-indexed database queries, swapping, small file system cache, ...)
      - How to detect ? top, iostat
      - See iowait ? Stop worrying about php, fix the I/O problem !
        Cpu(s):  0.2%us,  3.0%sy,  0.0%ni, 61.4%id, 35.5%wa,  0.0%hi,  0.0%si,  0.0%st

        avg-cpu:  %user   %nice %system %iowait  %steal   %idle
                   0.10    0.00    0.96   53.70    0.00   45.24

        Device:     tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
        sda      120.40         0.00    123289.60          0     616448
        sdb        2.10         0.00      4378.10          0      18215
        dm-0       4.20         0.00        36.80          0        184
        dm-1       0.00         0.00         0.00          0          0
  • 65. Much more than code (DB server, Webserver, User, Network, XML feed)
  • 66. Look beyond PHP (or Perl, Ruby, Python, ...) !
  • 67. Questions ?
  • 68. Questions ?
  • 69. Contact
      - Twitter @wimgtr
      - Web http://techblog.wimgodden.be
      - Slides http://www.slideshare.net/wimg
      - E-mail [email protected]
      - Please... Rate my talk : http://joind.in/11676
  • 70. Step-by-step : most common issues
      - Using NFS ?
        Get rid of it ;-)
      - iowait on database server
          - I/O reads (use iostat) ? → missing/wrong indexes
          - I/O writes ? → no transactions ? too many queries ? bad DB engine settings
      - iowait on webserver (logs ? static files ?)
      - CPU on database server (missing/wrong/too many indexes)
      - CPU on webserver (PHP)
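
The Customer Z slides end on the point that file_get_contents/file_put_contents on a shared cache file is not atomic. A minimal sketch of one way to combine the advice from those slides (timeout, keep the stale copy when the source is down, never expose a half-written file) could look like the code below. The function names refreshCache and fetchWithTimeout are hypothetical and not from the talk, and the sketch assumes a POSIX file system where rename() within one directory replaces the target atomically. It fits best in the cronjob the slides recommend, not in the request path.

```php
<?php
// Hypothetical helpers for illustration only; names are not from the talk.

/**
 * Rewrite $cacheFile when it is missing or older than $maxAge seconds.
 * $fetch is any callable returning the new content as a string, or false on failure.
 * Returns true when the cache was replaced, false when it was left untouched.
 */
function refreshCache(string $cacheFile, int $maxAge, callable $fetch): bool
{
    clearstatcache(true, $cacheFile);
    if (is_file($cacheFile) && filemtime($cacheFile) >= time() - $maxAge) {
        return false; // still fresh, nothing to do
    }

    $content = $fetch();
    if ($content === false || $content === '') {
        return false; // source is down: keep serving the stale copy
    }

    // Write to a temporary file in the same directory, then rename it over
    // the target, so readers see either the old or the new file, never a mix.
    // Assumption: the directory is writable; otherwise tempnam() falls back to
    // the system temp directory and the rename may no longer be atomic.
    $tmp = tempnam(dirname($cacheFile), 'feed');
    if ($tmp === false) {
        return false;
    }
    if (file_put_contents($tmp, $content) === false) {
        unlink($tmp);
        return false;
    }
    return rename($tmp, $cacheFile);
}

/** Fetch a URL with a hard timeout in seconds; returns the body or false. */
function fetchWithTimeout(string $url, int $seconds = 5)
{
    $context = stream_context_create(array('http' => array('timeout' => $seconds)));
    return @file_get_contents($url, false, $context);
}

// Possible usage from a cronjob (URL and path taken from the slides):
// refreshCache(APP_DIR . '/tmp/cacheFile.xml', 900, function () {
//     return fetchWithTimeout('http://www.scrambledsitename.be/xml/feed.xml');
// });
```

Passing the fetcher in as a callable keeps the cache logic testable without network access; any source with its own timeout handling (curl, SOAP) can be plugged in the same way.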