Planet PHP Feed dead?
I think WordPress.com has changed my feed URLs. I had to resubmit them to both Planet MySQL and Planet PHP. I am not sure Planet PHP has picked it up yet. This is sort of a test post to see if it shows up. If not and you have contact with the Planet PHP guys (I have emailed and resubmitted my feed) can you let them know that my feed pickup URL is http://doughboy.wordpress.com/category/php/feed/? Thanks.
MySQL Conference Submissions
Well, it seems to be the thing to do to talk about the things you submitted to MySQL Conference. So, I figured I would share. I submitted 3 topics.
From one server to a cluster
In the last 10 years, dealnews.com has grown from a single shared hosting account to an entire rack of equipment. Luckily, we started using PHP and MySQL very early in the company’s history.
From the early days of growing a forum to surviving _Slashdotting_, _Digging_ and even a Yahoo! front page mention, we have had to adapt both our hardware and software many times to keep up with the growth.
I will discuss the traps, bottlenecks, and even some big wins we have encountered along the way using PHP and MySQL. From the small scale to using replication and even some MySQL Cluster we have done many interesting things to give our readers (and our content team) a good experience when using our web site.
MySQL hacks and tricks to make Phorum fast
Phorum is the message board software used by MySQL. One reason they chose Phorum was because of its speed. We have to use some tricks and fancy SQL to make this happen. Things we will talk about in this session include:
* Using temporary tables for good uses.
* Why PHP and MySQL can be a bad mix with large data sets.
* What mysqlnd will bring to the table with the future of PHP and MySQL.
* How Phorum uses full text indexing and some fancy SQL to make our search engine fast.
* Forcing MySQL to use indexes to ensure proper query performance.
Perils of distributed software
Ever written software that has to be installed on n + infinity environments? Phorum is such software. The Phorum team has been developing for 10 years. The vast number of configurations and the wide range of user experience have shaped how we have written Phorum. I will share some of the PHP, MySQL and user issues we have seen and how we have tried to work around them.
Wish me luck. Maybe you can get the full version of one or all of these in April.
Integer conversion issue.
So, there are some issues with pack and unpack when dealing with unsigned 32-bit integers. The docs state that formats L, V and N are all unsigned numbers. However, on 32-bit PHP, there is no such thing as an unsigned 32-bit number. So, to get around it there you need to do something like this.
$strInt = sprintf("%u", $int);
That gives you a string of the unsigned integer. Now, you can’t bit shift it or do any numeric operations on it. But, you can use the bcmath functions with it and use it in output. So, it is quite useful.
However, the real problem for me is coming on 64-bit machines. More and more of our servers are 64-bit. On 64-bit machines, the above code gives you some odd 64-bit unsigned integer. Not the 32-bit integer that pack/unpack was supposed to return. So, now, how do I convert a 32-bit wrapped signed integer to a 64-bit unsigned integer? Well, I don’t know. I am hoping one of you has the answer.
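One possible answer, offered as a sketch and not as a fix for the pack/unpack bug itself: on a 64-bit build a PHP integer is wide enough to hold any unsigned 32-bit value, so masking with 0xFFFFFFFF keeps only the low 32 bits and undoes the wrap. The helper name to_unsigned32 is made up for this example, and it returns a string so the result looks the same on 32-bit and 64-bit PHP.

```php
<?php
// Hypothetical helper: turn a wrapped (signed) 32-bit value into the
// unsigned 32-bit number it represents, as a decimal string.
function to_unsigned32($int)
{
    if (PHP_INT_SIZE >= 8) {
        // 64-bit PHP: keep only the low 32 bits.
        return (string) ($int & 0xFFFFFFFF);
    }
    // 32-bit PHP: there is no unsigned type, so format it as a string.
    return sprintf("%u", $int);
}

echo to_unsigned32(-1), "\n"; // 4294967295
```

The string can go straight into output or the bcmath functions. On 64-bit you could also keep the masked value as a plain integer if you need to bit shift it.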
FWIW, there is an active bug report on the pack/unpack issue. I tried looking through the code, but my C is just not up to snuff enough. So, to whoever ends up fixing this, thanks. I really appreciate it.
Putting files into a database
So, once again, I was listening to the Pro::PHP Podcast (or is it Newscast? guys?). They were talking about putting files into a database table. Now, most people will say you should never do this. And lots of times they are right. And once upon a time I agreed with them without question. Then I started living in the real world where sometimes you have to do things you never thought you would. Here are the two places where I store files in a database.
Phorum
Ever written software that has to be installed on n + infinity environments? It really sucks. Phorum version 3.3 or 3.4 was the first to allow attachments. We put them on disk. The instructions for setting up a spot on disk for attachments were longer than all the rest of the install document. The support questions were even worse. And then there were the people that left their old host and didn’t take the files with them. Just the database. DOH! So, in Phorum 5 we decided that for ease of use, there would be no need for Phorum to write to disk. Well, we tried really hard anyway. It’s almost true. We only write cache data to disk. No config is written by the application and no permanent storage is written to disk. Phorum 5.2 (near beta) has a new file storage system that allows for modules to be written to store the data wherever you want it. There is already a disk storage module. But, already there are people asking really naive questions about it. Probably because someone told them to never store files in a table.
Replication
At dealnews we have a lot of images. Every deal gets its own image. We do 100+ deals a day. So, getting those images distributed to all the servers is a task. A deal could go from idea to live in 2 minutes. So, we decided to store the images in the database. 3 versions of the image in fact. With a little mod_rewrite magic we can pull the image from the DB and put it on disk on a front end server if it’s not on the server already. From that point forward the file is served off of disk. So, it’s a one time db hit per image per server. Not that big of a deal. We use a CDN now too, so it’s really not a big deal. We could probably skip the on disk part altogether. Backups are easy. All the attributes of a deal are in the database. Not some in the db and some on some disk. We can just import the db to a test machine and have a fully functioning set of data to work with.
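For the curious, the pull-from-db-then-serve-from-disk idea looks roughly like this. It is a sketch and not our production code: the function name, the cache directory and the $fetchFromDb callback (which stands in for the real query) are all made up for the example, and it assumes a rewrite rule sends requests for missing image files to a script that calls it.

```php
<?php
// Hypothetical sketch: return the path of an image on local disk,
// pulling it from the database the first time this server sees it.
// $fetchFromDb takes the image name and returns the binary data,
// or null if there is no such image.
function cache_image_on_disk($cacheDir, $name, callable $fetchFromDb)
{
    // basename() keeps a crafted name from escaping the cache directory.
    $path = rtrim($cacheDir, "/") . "/" . basename($name);

    if (is_file($path)) {
        return $path; // already on this server, no db hit
    }

    $data = $fetchFromDb($name);
    if ($data === null) {
        return null; // caller should send a 404
    }

    // Write to a temp file and rename it into place so a concurrent
    // request never sees a half-written image.
    $tmp = $path . "." . uniqid("", true) . ".tmp";
    if (file_put_contents($tmp, $data) === false || !rename($tmp, $path)) {
        return null;
    }
    return $path;
}
```

After the first request the web server finds the file on disk, so the script is never hit again for that image on that server.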
New type of content on dealnews.com
So, dealnews has had a pretty tried and true content format for years now. But, we decided it was time to try something new. I feel it helps to separate us from the other “deal sites” out there. We have started posting feature articles about price trends, rumors in the tech product market and all sorts of other things. Now, I say this is new, but in fact, this is what made dealmac, our original site, famous back in 1997. Our CEO Dan deGrandpre wrote several feature articles about Mac pricing. Those articles were picked up all over the Mac community. So, in some ways, we are back to the future.
One particularly interesting series is the dealpad. We are renting an apartment in Brooklyn, NY for our employees to use when in town at our new NY office. It’s cheaper than hotels according to our research. But, we need to furnish it. That is what the features are about. Furnishing the dealpad with stuff we list on the site. We practice what we preach.
My editor of choice
So, I was listening to the Pro PHP Podcast on the way home from work today. They were talking about Komodo a lot. I figured I would give my favorite editor a plug. Believe it or not, it’s jEdit.
I keep trying all the latest and greatest editors out there. I fought with Eclipse and have tried the newer more PHP centric offerings built on Eclipse. I recently tried out Komodo Edit for a week. I had tried the Komodo IDE when it came out for Mac a while back. But, I just keep coming back to jEdit.
What I like about it
The main thing that I like about jEdit over the other top contenders of the new generation is that it has a simple file browser. It does not have the concept of “projects”. Eclipse and Komodo both have these concepts. But, when I really got to looking at the projects in Komodo, you basically set a point in your filesystem and tell it that everything in this dir is Project Foo. So, really, you have to have your code organized on disk anyway. It also bugged me (in Komodo Edit at least) that my project file had to live in the same dir with my project’s code. That just seemed awkward. Not everyone that shares my SVN is gonna want that and it’s gonna be sitting there in my svn status as an unknown file.
Another thing I like about jEdit is the rather large plugin repository. Now, it’s an older project, so that is something that you would hope any established application would have. But, if I am thinking about switching today, I have to give the nod to jEdit here. The list is a bit Java-centric of course. It’s a Java application after all. But, there are some good ones in there like a PHP code structure browser. I can’t live without that. Makes finding functions or methods really easy in large libraries.
What I don’t like
It’s Java so it’s not quite like working with a native application. The dialogs are funny and the UI is just a bit off even with the Mac plugin that makes it more Mac looking. Having said that, I don’t want a truly “Mac like” editor. BBEdit and XCode are not my kind of editors. I like tabbed interfaces vs. multi windowed UIs.
It’s not an IDE, it’s an editor. There is no debugging, at least, not easily. There looks to be some ability to hook in debugging tools, but I have not gone through the trouble. Of course, that could be said of many of the IDEs out there. PHP has never had the ease of debugging that say Visual Basic had (still has?) back in 1998 when that was my full time job. That was one thing about VB I loved. The language was “eh”. But the IDE was really nice.
Things I don’t care about that you might
jEdit does not have an SVN plugin that I can find. I like my command line. I know one coworker is addicted to the Eclipse real time SVN diff highlighting. There is a CVS plugin, but I don’t know how good it is. I am not aware of any PHP code completion, but it may be there. I have an odd knack for remembering stuff like that and those little pop ups just annoy me. Oh, and did I mention it’s Java? That put me off for a long time. But, it won me over.
ForceType for nice URLs with PHP
This has been covered before, but I was just setting up a new force type on our servers and thought I would mention it for the fun of it. You see lots of stuff about using mod_rewrite to make friendly URLs or SEO friendly URLs. But, if you are using PHP (and I guess other Apache modules) you can do it without mod_rewrite. We have been doing this for a while at dealnews. Even before SEO was an issue.
Setting up Apache
From the docs, the ForceType directive “forces all matching files to be served as the content type given by media type.” Here is an example configuration:
<Location /deals>
ForceType application/x-httpd-php
</Location>
Now any URL like http://dealnews.com/deals/Cubicle-Warfare/186443.html will attempt to run a file called deals that is in your document root.
Making the script
First save a file called deals without the .php extension. Modern editors will look for the <?php tag at the start of the file and will color it right. Normally you take input to your PHP scripts with the $_SERVER["QUERY_STRING"] or the $_GET variables. But, in this case, those are not filled by the URL above. They will still be filled if there is a query string, but the path part is not included. We need to use $_SERVER["PATH_INFO"]. In the case above, $_SERVER["PATH_INFO"] will be filled with /Cubicle-Warfare/186443.html. So, you will have to parse the data yourself. In my case, all I need is the numeric ID toward the end.
$id = (int)basename($_SERVER["PATH_INFO"]);
Now I have an id that I can use to query a database or whatever to get my content.
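One thing to watch with the cast above: it quietly gives you 0 when the last part of the path is not a number. If you would rather know for sure, a slightly stricter version could look like the sketch below. The function name is made up, and the "digits followed by .html" pattern is just what my URLs happen to look like.

```php
<?php
// Hypothetical helper: extract the numeric ID from a PATH_INFO value
// such as "/Cubicle-Warfare/186443.html". Returns null when the last
// path segment is not "<digits>.html".
function deal_id_from_path_info($pathInfo)
{
    if (preg_match('/^(\d+)\.html\z/', basename($pathInfo), $m)) {
        return (int) $m[1];
    }
    return null;
}

// PATH_INFO is not set when the script is requested with no extra path.
$pathInfo = isset($_SERVER["PATH_INFO"]) ? $_SERVER["PATH_INFO"] : "";
$id = deal_id_from_path_info($pathInfo);
```

A null here means the URL never had a usable ID in it, which is a different case from an ID that simply is not in the database.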
Avoid “duplicate content”
The bad part of my use case is that any URL that starts with /deals/ and ends in 186443.html will work. So, now we have duplicate content on our site. You may have a more exact URL pattern and not have this issue. But, to work around this in my case, we should verify that the $_SERVER["PATH_INFO"] is the proper data for the content requested. This code will vary depending on your URLs. In my code, I generate the URL for the content and see if it matches. Kind of a reverse lookup on the URI. If it does not match, I issue a 301 redirect to the proper location.
header("HTTP/1.1 301 Moved Permanently");
header("Location: $new_url");
exit();
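The reverse lookup that produces $new_url can be sketched like this. The two function names, the "id" and "title" array keys and the slug rule are all assumptions made for the example. Your own URL scheme will differ, and this is not the exact code we run.

```php
<?php
// Hypothetical sketch of the reverse lookup. $deal is assumed to be a
// row with "id" and "title"; the slug rule is only an illustration.
function canonical_deal_path(array $deal)
{
    // Collapse every run of non-alphanumerics into a single dash.
    $slug = trim(preg_replace('/[^A-Za-z0-9]+/', '-', $deal["title"]), "-");
    return "/" . $slug . "/" . $deal["id"] . ".html";
}

// Returns the URL to redirect to, or null if the request is already
// the canonical one. $prefix is the Location the script is mounted on.
function redirect_target($pathInfo, array $deal, $prefix = "/deals")
{
    $canonical = canonical_deal_path($deal);
    return $pathInfo === $canonical ? null : $prefix . $canonical;
}
```

If redirect_target() returns something other than null, that value is what goes into the Location header.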
Returning 404
Now, you have to be careful to always return meaningful data when using this technique. Search engines won’t like you if you return status 200 for every possible random URL that falls under /deals. I know that Yahoo! will put random things on your URLs to see if you are doing the right thing. So, if you get your id and decide this is not a valid URL, you can return a 404. In my case, I have a 404 file in my document root. So, I just send the proper headers and include my regular 404 page.
header('HTTP/1.1 404 Not Found');
header('Status: 404 Not Found');
include $_SERVER["DOCUMENT_ROOT"]."/404.html";
exit();
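Deciding when to send those headers is the part that varies. As a rough sketch, assuming a lookup callback ($findDeal, a made-up name standing in for the real database query) that returns the deal row or null:

```php
<?php
// Hypothetical sketch: pick the status for a request under /deals.
// $findDeal returns the deal row, or null when no deal has that ID.
function deal_status($id, callable $findDeal)
{
    if (!is_int($id) || $id <= 0) {
        return 404; // no usable ID in the URL
    }
    return $findDeal($id) === null ? 404 : 200;
}
```

Anything that comes back as 404 gets the headers and the include shown above, so a random URL under /deals never answers with a 200.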