10.29.05
Baby mania
So, most who read this also know that I have a baby on the way. Sofia Renee Petersen is due December 15th, and we are very excited to finally meet her. Funnily enough, this post is actually about another baby: my mother-in-law, sister-in-law and nephew came to town to visit us this weekend, and I have had so much fun with this little man. He is eight months old and way too wickedly cute; this baby doesn’t stop smiling, posing, cooing and gurgling happily.
Babies rock.
10.28.05
Forbes evil? Blogs evil?
Forbes magazine features a cover-story about how blogs are evil and people who participate in the discussions on blogs are an evil lynch-mob… I won’t go into detail, rather I’ll just link to their article… which gets more people reading their information which I’m sure is in part the goal of this whole thing. In brief, I tend to agree with this Warner Crocker post.
So, the reason blogs are evil? People can say bad things about others… these things can be completely unsubstantiated. Could this not happen in the pre-blog world? Yes, it could, and did. I agree that some people misuse the tools at their disposal, but in the ever-popular pro-gun argument, it is not the tool that is evil or kills; it is the ill-guided user of the tool that does damage. I wish there were a way to make an honest call to bloggers, asking them to at least think and maybe even do a small amount of research before they swallow everything they hear hook, line, and sinker. Does that mean I agree with Forbes? In a way I do… I don’t agree with their handling of the issue or the sensational way they chose to cover it, but I do wish more people would act credibly in the blog world. That does not, however, mean that I think they should be forced to by their providers, which is what Forbes is calling for.
So, in essence, I disagree with Forbes’ main point about needing people like Google to prosecute or at least turn over the names of people who post things that are in disagreement with corporate America. I also think Forbes messed up some of the facts when it said that Google makes a concerted effort to go out and shut down ’splogs’ because they skew page-rank. If you look at any discussions on predictive market analysis gathered from blogs, forums, etc, you will find many complain that Google ignores those ’splogs’ because of the ad-hits that they generate. I think probably there is truth in both sides of this, as in most cases.
I guess in general, I’m disappointed that Forbes actually published an article that condemns the freedoms of speech and press that are provided by the constitution, and upon which Forbes’ main business (magazine production/sales) is based. I don’t think their coverage was 100% incorrect or evil, but I think it was not really a ‘proper’ way to cover it. I am sure there will be many blogs out there calling for blood on this one, but not me. I have another question though…
How does this sort of article affect business based upon blogs, forums, etc and the content generated on them? Infoseek was noted in the article, but there are other companies such as BuzzMetrics and Umbria and I wonder if this is seen as a good or bad occurrence for them. If you take the ‘there is no bad publicity’ approach, then any time that blogs in general spend in the spotlight is good. However, do stories such as these decrease the market’s trust of blogs and perhaps their interest in viewing the trends? Or maybe these sorts of stories heighten the awareness that people ARE out there talking about your products, and maybe you need to tap into that? Interesting times and interesting thoughts make for tired mornings and distracted work.
10.27.05
Job Thoughts
This is in no way stating that I am in the following position, that I plan to be in the following position or that the following position is in any way shape or form related to me; this is purely a curiosity that has struck me recently.
Let’s say you have three jobs you’re considering. Let’s go so far as to say that you are employed at job #1, and that jobs #2 and #3 are being offered. What should you consider in making your decision and depending upon your current situation (single, married with children, somewhat lame superhero, etc) which job would be the ‘best’?
 Job #1 => Very stable. The job offers slightly below-average pay for the industry but the job is stable, there is SOME growth opportunity in the 5+ year range, the benefits are truly the most amazing part of the job. Education benefits, good retirement investments, and an incredibly relaxed environment are all part of the package. The downsides are the average pay and the lack of exciting opportunities.
Job #2 => HUGE corporation. One of the biggest corporations in your field offers the ability to change jobs within the company and a large assortment of incredibly exciting projects upon which you could work. The job could be unstable simply because of large corporation cutbacks and the chance of failure under fire. The upsides would be likely a good benefits package, moderate to good pay, stability if you really work and network (a.k.a. kiss-butt) well, and a variety of interesting things to work on. Downsides would be the possible lack of stability, the much higher-stress work environment, and you still run the possibility of being put into a task group that you don’t enjoy.
Job #3 => Explosively growing startup. Instability is the keyword here. If that factor were taken out of the equation, this would be a no-brainer. A rather high salary, potentially amazing benefits (stock options of Google, anyone?), and lots of innovative, high-energy, very interesting work going on are the upsides. On the downside of things, your job could be gone tomorrow… or next week… or next month… or maybe they’ll sell out to another company and you’ll end up working for the big company anyway, but be treated as a second-class citizen.
So given your particular situation, which would you choose? What other factors am I not considering here? Are there other glaring upsides or downsides to any of the jobs as described? I’m guessing most single people will probably say the startup sounds great. I’m guessing most family-having people would choose one of the other two. Does it change your mind any if I say that the startup has been around for 2-3 years, has already sold its ‘product’ to several Fortune 500’s, and that they’ve been successful enough to inspire a second round of VC investment of nearly $7 million? Does it change anything if job number one is for a private university? Does it change anything if the mega-huge corporation is Microsoft? Does it change anything if you HAVE job #1, but haven’t really even had official offers for either of the other two yet? Just looking for some thoughts here… any help?
Update on the Linux Install
Okay, so I thought everything was going great with Red Hat 9, and indeed, for what it did, it was great. But I wanted to use the ndiswrapper package in order to use my Linksys Wireless USB 11 stick. Well, as it turns out, RH9 was using the 2.4.somethinglessthanIneeded kernel. Basically, I was going to have to update the kernel; this is not something I like doing, but not something I’m entirely unfamiliar with. But basically, I didn’t want to nuke the whole install just because my NIC wasn’t working, but I wanted to get it working… so I updated to Fedora Core 3 which took just about as long as a from-scratch install. But anyway, once that was done, I now had a new enough kernel, my XWindows running in full screen, and it boots to prompt (runlevel 3) instead of booting graphically, so I’m happy… sorta.
So I get the Win32 drivers for my NIC, and the ndiswrapper, and after much toiling and trouble with it (mostly my own fault for not reading things completely) I get the NIC installed, using the ndiswrapper, and Linux sees it and there are no real problems… except for the prevailing problem that I STILL cannot get DHCP. So, this gets me thinking because now I’ve had like 4 Linux installs with two different network cards not getting DHCP to work. So I check my Windows machines and find that if I reboot with DHCP enabled, they don’t get leases either. GRRRR! You are KIDDING me! So I unplug/restart my router and my DSL modem (just to be thorough) and sure enough when it comes back up, my Windows machine instantly grabs DHCP, I change the wireless NIC on my now FC3 box to use DHCP and voilà, I have an IP address and internet connectivity.
I take the laptop upstairs triumphantly (the battery doesn’t work, so I shut it down, unplug it, take it upstairs, plug it in, and boot it up). wlan0 comes up with an IP address, I ping google.com, of course it comes back just fine and I excitedly issue the ‘startx’ command. XWindows comes up, gaim autoloads because I told it to, and I get my contact lists and such. Now, I try to open a web browser to start … browsing the web, of course. Firefox takes forever to even open up (I mean, I get an entry down in the taskbar, but no visual in the screen area)… finally it comes up, but can’t get to the webpage… I run a terminal, which sits in the taskbar forever and never actually shows up… Any program I try to run, nothing will open or execute correctly; gaim shows me as online but I cannot ACTUALLY message anyone, even my wife sitting in the chair next to me… I try to log out of XWindows but it’s not letting me, so I CTRL+ALT+BACKSPACE my way out, which shows all kinds of crashing errors. I get to the prompt, try to ping google.com again, and this time, it can’t get there… wtf happened?
I tried this 3 times in a row before deciding not to give it any more thought and instead to just sulk about the house the rest of the night feeling defeated.
I WILL get this working, and I don’t want to re-install some other variant of Linux at this point because I think I’m very close… any thoughts on what would cause an ndiswrapper’d wireless NIC to work before entering X but slowly degrade to the point of no longer working?
10.25.05
Linux installs
So I have a Dell Inspiron 4150 laptop that I bought for my semester in Germany back in 2002-2003. I have, I believe, put Linux on this laptop several times, but I always end up removing it because of various frustrations. I have, since the Zend/PHP conference, become a little more convinced that I WILL make it work… I don’t need some crap Alienware pre-fabbed laptop, I can do this on my own. Of note: the pcmcia slots are bent beyond any hope of use by a gravity-induced, floor-terminated trip started by careless placement of the laptop on a table. Also, in a similar accident, the USB port became … ‘wobbly’ I like to call it. All of this is to let you know that my interface options are somewhat limited.
This laptop has an onboard NIC, so network should not be a problem, an ATI RADEON 7500C 32mb card (rather generic/common) so that should not be a problem, a sound card that I’ve had working in Linux before so we’re good there, and overall just some mid-old age hardware that all should work with Linux… so I’m super-geeked and convinced that this is SOOOO gonna happen (This is Sunday night btw).
First try – Fedora Core 3 :: The OS goes on, though the loading process is somewhat lengthy, but it DOES go on, and by the end, I have a working system (other than the NIC) that boots straight into XWindows… I don’t want it to go straight into XWindows. Also, the system feels VERY slow and cumbersome. I go through the XWindow tools for network management and eventually I get the network card working correctly by assigning it a static IP address, and telling it what nameservers to use. Not the ‘preferred’ method of function, but it’ll do… for a minute. It eats at me for a whole night.
Next try – Debian Sarge :: So, I get on Debian’s site and download the floppy-disk install images, write them to disk using winrwwrt, boot up and see an error message.. after several attempts, I realize that the floppy is close to going bad and because of the low-level reading nature required of the ‘boot’ floppy, this will not do. So, I swap the root and boot floppies, rewrite the image and now all is good in the hood. I boot up and I see the ever-so-friendly and very familiar Debian installer. I go through the installer a bit, and eventually get to the point that it’s supposed to hit the Debian mirrors and pull stuff down… I realize I’ve not got DHCP working again, so I reboot, go into the expert install, set the IP address, set the DNS, now I get access to the mirrors and it pulls down the base Debian system, installs it, and reboots. At reboot I now re-enter the Debian installer, again without DHCP address, and this time however, I cannot find ANYWHERE to manually set the network configuration. Because of that, I save the settings and end up with a VERY basic Debian install. I search the internet and hack around in the files until I find the places to manually set up my network card (/etc/network/interfaces or something like that) and where to set the dns server info (in /etc again somewhere). I do so, and now I can ping google… yay! So, I apt-get install x-window-system and BOOM… it works like I remember Debian working… it’s friggin gorgeous… By the way, I haven’t mentioned that when booting into the terminal like I just did, any of the Linux distros always utilize only like the center 40% of my LCD, so I have this tiny box in the middle of my screen with the info on it, but I can see that X is installing. I test the X install afterwards, and it’s working… awesome. Now I apt-get install kde and like 20 minutes later, KDE is downloaded and installed. This is beautiful. 
I log into X and there’s KDE… however, it’s still only using the small center of my screen… very annoying. So I reboot… Same thing. I figure well, at least DHCP is working now I’m sure, so I use XWindows to reconfigure the NIC for DHCP, do a dhclient, and nothing… reboot… nothing. Furious, I turn the machine off; I give up.
Next Try – Red Hat 9 :: Okay, I love RH9, some systems are just CLASSIC. The system goes on quickly, I choose to hand pick every single thing that is installed and I do so, I ask it to boot into text mode instead of graphical, everything installs, it uses the full screen… so now everything is just absolutely awesome… DHCP still doesn’t work. As a note, at ONE point in time, DHCP DID work because I used this laptop in lotsa different DHCP-only environments. I’m still not sure what’s going on but I’ve decided to do two things.
I’ve decided first of all that I’m going to stick with RH9, I’m going to get the Windows driver for my USB-wireless (hope my USB works happily for a bit) network stick, and get the NDISwrapper and go that route. Hopefully my USB-stick will be able to pull DHCP plus I’ll be able to play wirelessly so I can work from upstairs where my family stays.
Secondly, I am going to dual-boot my computer at work. This may seem unrelated, but I want to try one of the heavier installations (Fedora Core 4 is burning as we speak) and this computer has all the nice new stuff. I have a 3.2ghz, 1gb RAM, 100GB HD, svideo out to a nice monitor, etc… I have not really had a chance to use Linux on a really buff machine ever… I always dedicate yesterday’s trash to Linux. Well, if I can get OpenOffice (um, yea), Firefox/Thunderbird (well duh), Gaim (ya think?), Zend Studio Pro (it works) and Meeting Maker (the only possible challenge) to run on FC4, I will most likely attempt, in a month or two, to format this Windows machine, and go full-out FC4.
I don’t hate Windows honestly, but all the software I use on any regular basis can and does run on Linux, plus I’ve been wanting to learn the ins and outs more for a long time, and I think jumping off the deep-end may be the only way to achieve the competency I want. Plus, I see RHCE (Red Hat Certified Engineer) and perhaps RHCT (RHC Technician) in the future for me possibly and if so, I want to get prepared. Finally, our department is moving towards Linux and the LAMP (Linux, Apache, MySQL, PHP/Perl/Python) stack for development, and I think it would be of great use for me to be more familiar.
So anyway, that’s the update for now… hopefully tonight I’ll get RH9 with the USB wireless going on DHCP, and then maybe tomorrow or sometime later this week I’ll finish burning FC4, and dual-boot my work system and see how that goes.
Zend Conference Entries
Okay, so I apologize for the Zend Conference entries. Most, if not all, of them are going to be very hard for anyone else (and even myself in some cases) to read due to the fact that I was just typing them while listening to the presentation and reading from the PowerPoint slides. In some of the notes, it’s pretty clear that I mostly just captured the PowerPoint presentation but in other cases, almost everything I typed was just what was being said. I have not gone back over any of them to try to edit them into the realm of being readable but I hope to maybe do that eventually. Also there WILL be a couple other posts as I get time to post what I have captured on paper. I hope, if nothing else, you enjoy reading over what little note-taking I was able to do.
Intel VT Chipset
Intel has unveiled new microprocessor chipsets that will be referred to as “VT” chips; I’m sure they’ve been talking about it for a while, this is just the first chance I’ve had to see it. VT stands for Virtualization Technology. This is a new technology that provides native hardware support for running multiple operating system installs at once, without emulation and without rebooting. This means that if you purchase a system built around an Intel 3.8GHz VT chip, you will be able to use a 3rd party layer like XenSource (others are on their way from such vendors as Microsoft and VMware) to install a host operating system and multiple other systems as well. With XenSource, Linux works as the host, and with the Windows solution undoubtedly some Windows product will have to be used as the host. The real excitement comes from what is actually going on under the hood, which I am certain I’m not totally qualified to discuss in detail, but even the generalities make my extremities tingle. The operating systems are 100% completely independent of each other and in fact, thanks to the hardware from Intel, they are not even aware of each other’s existence. I think there are good points to this as well as bad points, but I haven’t decided whether I think the complete separation is mostly a good or mostly a bad thing.
Good points include the possibility of being able to use one operating system to repair the files on disk of the other operating system in case of corruption causing inability to boot, and the fact that upgrading or getting rid of either operating system should not damage the other installation at all. The bad points that I can think of are limited to questions of shared memory. I would want the ability to transfer my cut and paste / clipboard buffer between the operating systems. This may seem like a small thing, but in function, it would be an exceedingly useful feature. So many times, when working with a dual boot system, two computers, or with a kvm, I have done something on one computer, whether it’s just browse to a website or type a paragraph and found for some reason that the information in my clipboard would much better serve me on the other machine. I’m not certain that this would be a feature ‘required’ by the industry, but it would be nice as a new feature in a later revision if not in the initial release.
Hardware Layouts for LAMP Installations
You have your apache/php server, and it sends SELECT statements to slave db servers, and the insert/update/delete traffic goes to your master db server. The MYSQL replication handles the splitting out of the IUD vs SELECT stuff.
Slave lag is when the user maybe posts a comment to a forum, so on the master server, an insert has been done… however that hasn’t replicated so when they are instantly redirected back to the thread, the post does not show up when it does a SELECT against the slave servers.
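The notes don't say how the speaker worked around slave lag, so the following is only a minimal sketch of one common approach, not what was presented: the application sends writes to the master, and for a short window after a user writes, that user's reads also go to the master so they see their own post. The class name, the `sticky_seconds` value and the "master"/"slave" labels are all hypothetical.

```python
import time

# Statements that change data (the "IUD" traffic from the notes).
WRITE_VERBS = ("INSERT", "UPDATE", "DELETE")


class QueryRouter:
    """Picks a server role ("master" or "slave") for each SQL statement."""

    def __init__(self, sticky_seconds=5.0, clock=time.monotonic):
        # sticky_seconds is an assumed grace period, not a figure from the talk.
        self.sticky_seconds = sticky_seconds
        self.clock = clock
        self.last_write = {}  # user id -> time of that user's last write

    def route(self, user_id, sql):
        verb = sql.lstrip().split(None, 1)[0].upper()
        now = self.clock()
        if verb in WRITE_VERBS:
            self.last_write[user_id] = now
            return "master"
        wrote_at = self.last_write.get(user_id)
        if wrote_at is not None and now - wrote_at < self.sticky_seconds:
            # This user just wrote; the slaves may not have the row yet.
            return "master"
        return "slave"


router = QueryRouter()
print(router.route("alice", "INSERT INTO posts (body) VALUES ('hi')"))
print(router.route("alice", "SELECT body FROM posts"))
print(router.route("bob", "SELECT body FROM posts"))
```

The trade-off is a little extra read load on the master in exchange for users always seeing their own comments right after the redirect.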
How is it really done? Put a load balancer between the apache/php servers and the slaves; this distributes the traffic across the slaves correctly. This way you can remove/add slaves and it never affects the service. Another side effect is you get another monitoring point, because the load balancer knows exactly how many sessions are coming in and going out. You can treat your entire slave pool as one resource, which makes capacity planning a lot easier if you know the ceiling of each slave.
How do you know the ceiling? First make a good guess based upon other peoples’ benchmarks. Then get more machines than you need. In production, during a lull in traffic, remove machines from the pool… when the slaves begin to bog down/lag, you know you’ve found your ceiling. The QPS you saw right before slave lag set in: THAT is your ceiling.
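As a rough sketch of that procedure: if you log per-slave QPS and slave lag each time you pull a machine from the pool, the ceiling is the last QPS reading before lag appeared. The function name, the sample numbers and the `max_lag_seconds` threshold are made up for illustration; they are not from the talk.

```python
def find_ceiling(samples, max_lag_seconds=0):
    """Return the last per-slave QPS seen before slave lag set in.

    samples: (qps_per_slave, slave_lag_seconds) pairs, in the order they were
    observed while machines were being removed from the pool.
    Returns None if the very first sample was already lagging.
    """
    ceiling = None
    for qps, lag in samples:
        if lag > max_lag_seconds:
            break  # lag has set in; the previous reading was the ceiling
        ceiling = qps
    return ceiling


# Hypothetical readings taken during a lull, one per machine removed.
readings = [(400, 0), (520, 0), (690, 0), (910, 4)]
print(find_ceiling(readings))
```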
What can be bad/tough about load balancing? Not all load balancers are created equal, and not all vendors expect their product to be used this way, so support may still be thin. Not that many people are doing it in high-volume situations yet, so support from the community isn’t large either. Gotchas:
- Port exhaustion: this is about TCP ports… you only have a limited number of them, then 1024 of those are reserved, each is forced to stick around for 120 seconds or so in timed-wait, etc…. you run out of ports quickly. You will have 535 concurrent max connections per ip… you can use a pool of ips on the database slave/farm side, sometimes referred to as subnet IP’s or PiPs.
- Health checks: the LB won’t know anything about how well each MySQL slave is doing, and will pass traffic as long as …. he skipped slide. Solutions: have each server monitor itself, and shut off/firewall its own 3306 port, even if MySQL is running (using NAGIOS)… dirty but workable… Cleaner would be to have each server monitor itself and run a check via xinetd… the LB checks every X seconds, and if it doesn’t get a response from any given slave it does…. then he skipped that slide as well…
- Balance algorithms: load balancers know http, ftp, basic tcp but not sql… two things to care about…. should the server still be in the pool? (health check) and how should load get balanced? Least connect/least load, bad. Round-robin or random selection of the servers is best. If you don’t use it, you get “evil favoritism”. Don’t have round-robin try to figure out the health of any server, just hand off the queries as each server comes up.
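To make the split between "is this server in the pool?" and "who gets the next query?" concrete, here is a small sketch. It assumes some separate health check (the xinetd-style check mentioned above, say) calls `mark()`; the class and the server names are hypothetical and this is not how any particular load balancer is implemented.

```python
from itertools import cycle


class SlavePool:
    """Round-robin over slaves; health is decided elsewhere and only recorded here."""

    def __init__(self, slaves):
        self.slaves = list(slaves)
        self.healthy = set(self.slaves)
        self._ring = cycle(self.slaves)

    def mark(self, slave, is_healthy):
        # Called by an external health check, never by the balancing logic.
        if is_healthy:
            self.healthy.add(slave)
        else:
            self.healthy.discard(slave)

    def next_slave(self):
        # Walk the ring at most once, skipping servers marked unhealthy.
        for _ in range(len(self.slaves)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy slaves in the pool")


pool = SlavePool(["db1", "db2", "db3"])
pool.mark("db2", False)  # health check pulled db2 out of the pool
print([pool.next_slave() for _ in range(4)])
```

The round-robin step never looks at load or connection counts, which is the point made in the notes: it just hands off the next query to the next server still in the pool.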
In the box considerations:
Interleaving memory *DOES* make a difference.
ALWAYS RAID10 (or RAID0 if you’re crazy) but NEVER RAID5 (for Innodb, anyway)
RAID10 has much more read capacity, and a write penalty, but not as much as RAID5
Always have battery backup for HW RAID write caching
or, don’t use write caching at all…
Always have proper monitoring (nagios, etc) for failed/rebuilding drive
SATA or SCSI? SCSI! It’s worth it!
10k or 15k RPM SCSI? 15k! It’s worth it! (~20% performance increase when you’re disk bound… at flickr, this was the difference between having or not having slave lag)
For 64bit Linux (AMD64 or EM64T)
Crank up the RAM for InnoDB’s buffer pool
Swapping is very, very bad. Either:
turn it off (slightly scary?)
leave it on and set /proc/sys/vm/swappiness = 0
Does 10k REALLY matter versus 15k? Some chunk of records were deleted, and propagated down… the 10k drives just did not catch up at all… period…. ever… the 15k drives caught up with slave lag in 10-15 minutes.
MYSQL with a SAN
- DO layout storage same as if they would be local
- DO make sure that HBA (fiber card) driver is well supported by linux
- DON’T share volumes across databases
- DON’T forget to correctly tune Queue Depth Size, which should be increasing, from server HBA -> switch -> storage.
Caching your static content
SQUID = good
Relieve your front-end PHP machines from looking up data that will never (or rarely) change
Generate static pages, and cache them in squid, along with your images
Use SQUID to accelerate plain-old origin webservers, also known as “reverse proxy” HTTP acceleration as described here and elsewhere: www.squid-cache.org/Doc/FAQ/FAQ-20.html
HTTP req -> SQUID -> Apache -> Storage
Good HW layout for high-volume SQUIDDING
Do use SCSI and many spindles for disk cache dirs
Don’t use RAID
Do use network attached storage, or place the origin servers on separate machines.
Do use ext3 with noatime for disk cache dirs
Do monitor squid stats
FLICKR Style, 2gb RAM for caching, DISK, 5-10GB per spindle, dev/sda1..dev/sda6. Caching policy,
SQUID Stats for Flickr:
>2800 images / second, ~75-80% are cache hits
~10 million photos cached at any time
1.5 million cached in memory
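A quick back-of-the-envelope on those numbers, just to show what the cache buys you: at the quoted hit rates, only the misses ever reach Apache and storage. The function name is made up; the 2800/second and 75-80% figures are the ones from the notes above.

```python
def origin_requests_per_second(total_rps, hit_ratio):
    """Requests per second that miss the cache and fall through to the origin."""
    return total_rps * (1.0 - hit_ratio)


# Figures quoted above: ~2800 images/second, 75-80% cache hits.
best_case = origin_requests_per_second(2800, 0.80)
worst_case = origin_requests_per_second(2800, 0.75)
print(round(best_case), "to", round(worst_case), "requests/second reach the origin")
```

So somewhere around 560 to 700 of the 2800 requests per second actually hit the origin servers; squid absorbs the rest.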
Business and Legal Issues Associated with Open Source
Business Case:
- Accelerated time to deployment.
- Open source facilitates true adoption of standards.
- Code transparency
- Lower costs
- Open source challenges the proprietary business models
Challenges of Adopting OSS
- Evolving supplier business models
- Confusing licensing landscape
- Lack of availability of skilled resources
- Demand for developer mindshare outstripping availability
- Too many choices
- Competitive FUD
Community Involvement is Very Important
- Sourceforge
- FSF (free software foundation) — provided the GPL
- OSI (open source initiative) — approves licenses as OS or not.
- OSDL (open source development lab)
- The developers, the ‘traditional (more related to IT)’, ‘professional (more theoretical people)’, or ‘consumer (using the OS solutions to build their own applications but may not be hugely involved in the actual OS community)’.
Adopting OSS
- Document your needs, use cases, etc
- Use sourceforge
- Review the project description, licenses, etc
- Examine the community. How will you handle mailing lists, how many developers, FAQs, documentation, bug reports and tracking, user groups, etc.
- Examine the alternatives for commercial support.
Legal Case:
- What can you do with the open source software? Licenses are critical. Make sure you choose the correct license for your project.
- How can you combine it with other software.
- Potential for litigation.
Licenses:
- Types of Licenses
- CopyLeft (viral): GPL
- Notice license: New BSD, Apache
- Reciprocal: MPL, CDDL, OSL
- The PHP License
- Distribution Obligations / Restrictions — what must you, or must you not do, when distributing applications that are licensed with a given license, or that contains code protected by those licenses.
- Notice Requirements
- Source Disclosure Obligations
- Marketing Attribution Obligations
PHP License and Usage
- Name limits (can’t call your product PHP)
- Disclaimers
- Copyright notice
- License versions
- May be used in proprietary products
- It is NOT compatible with GPL
Potential Consequences for Violation of Open Source Licenses
- Copyright infringement actions
- Negative Publicity (one of the strongest weapons available to the OS community is the Internet)
- Possible Monetary problems (costly delays of launch, or total recall, expensive redundant dev efforts, and restricted commercialization and lost profit)
- Potential enforcement rights for every contributor
- Automatic termination of some licenses
Limitations on Use of Third Party Code
- Just because you can download it doesn’t mean you can use it
- All software code is subject to copyright protection (maybe patent protection too)
Use of OSS
- Whether in infrastructure or product every company should have an open source (or third party software) Policy.
- Inventory what has already entered the company
- Develop policy for handling third party software
- Develop effective procedures for implementing the policy
- Educate employees about the policy and procedures.
Myths
- You cannot use OSS in a proprietary environment.
- All os licenses require the release of source for everything.
- Just say no is easiest.
- None of these agreements are enforceable so it doesn’t really matter.
- No one will ever know.
- If I ever begin to think about all of these obligations, I will give up. To survive you have to accept some risks and just move on.
Conclusions
- Open source is fundamentally disruptive in the enterprise market.
- Open source is becoming pervasive
- Consumers and vendors of software should have an open source strategy
- Some critical issues remain uncertain
What is PEAR?
PEAR has been around since 1999 and therefore it is rather stable by now. No components are published under the GPL; only the LGPL, Apache, PHP, and similar licenses are used. This means that if you build a commercial application, you can publish it with the PEAR components included without worrying about which licenses are included.
PEAR has a PEAR installer that can be used for building installs for PEAR and PECL (C extensions, pecl.php.net). The installer works on all major operating systems, and has different GUIs available (console, Web, GTK). It handles dependencies and provides tools for developers as well. The new installer works much like apt-get from Debian. You can set up a PEAR channel that is basically like the PEAR repository. You can set up the installer to pull in external non-PEAR dependencies, which will be referred to by their URLs. The 1.4 PEAR can run post-install scripts to do things like set up databases, move files around, etc.
PHAR support… PHAR is the PHP equivalent to JAR…. you can package all your php files, everything, into one tar.gz-like file. You simply double-click/run the file and it executes your PHP application. PHAR is supported in the new 1.4 PEAR. It now uses a package.xml file to specify the various needed pieces of information for your application/component. Multiple module support allows you to bundle whatever modules your application needs along with the install. It can gracefully handle upgrades of existing packages, or a base installation if the module does not already exist.
pearified.com – phpMyAdmin, serendipity, and other popular tools are up on this site.
pear.php-tools.com – PAT tools available.