<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
 
 <title>Huy Nguyen</title>
 <link href="http://www.huyng.com/atom.xml" rel="self"/>
 <link href="http://www.huyng.com/"/>
 <updated>2012-07-29T16:40:07-07:00</updated>
 <id>http://www.huyng.com/</id>
 <author>
   <name>Huy Nguyen</name>
   <email>huy@huyng.com</email>
 </author>

 
 <entry>
   <title>A guide to analyzing Python performance</title>
   <link href="http://www.huyng.com/posts/python-performance-analysis"/>
   <updated>2012-07-19T00:00:00-07:00</updated>
   <id>http://www.huyng.com/posts/python-performance-analysis</id>
   
   <category term="programming" />
   
   <category term="python" />
   

   <content type="html">&lt;p&gt;While it’s not always the case that every Python program you write will require a rigorous performance analysis, it is reassuring to know that there are a wide variety of tools in Python’s ecosystem that one can turn to when the time arises.&lt;/p&gt;

&lt;p&gt;Analyzing a program’s performance boils down to answering 4 basic questions:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;How fast is it running?&lt;/li&gt;
  &lt;li&gt;Where are the speed bottlenecks?&lt;/li&gt;
  &lt;li&gt;How much memory is it using?&lt;/li&gt;
  &lt;li&gt;Where is memory leaking? &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Below, we’ll dive into the details of answering these questions using some awesome tools.&lt;/p&gt;

&lt;h3 id=&quot;coarse-grain-timing-with-time&quot;&gt;Coarse grain timing with time&lt;/h3&gt;

&lt;p&gt;Let’s begin by using a quick and dirty method of timing our code: the good old unix utility &lt;code&gt;time&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;time &lt;/span&gt;python yourprogram.py

real    0m1.028s
user    0m0.001s
sys     0m0.003s
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The meaning of the three output measurements is detailed in this &lt;a href=&quot;http://stackoverflow.com/questions/556405/what-do-real-user-and-sys-mean-in-the-output-of-time1&quot;&gt;stackoverflow article&lt;/a&gt;, but in short:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;real - refers to the actual elapsed time&lt;/li&gt;
  &lt;li&gt;user - refers to the amount of CPU time spent outside of the kernel&lt;/li&gt;
  &lt;li&gt;sys - refers to the amount of CPU time spent inside kernel-specific functions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can get a sense of how much CPU time your program used, regardless of other programs running on the system, by adding together the &lt;em&gt;sys&lt;/em&gt; and &lt;em&gt;user&lt;/em&gt; times.&lt;/p&gt;

&lt;p&gt;If the sum of &lt;em&gt;sys&lt;/em&gt; and &lt;em&gt;user&lt;/em&gt; times is much less than &lt;em&gt;real&lt;/em&gt; time, then you can guess that most of your program’s performance issues are related to IO waits.&lt;/p&gt;
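
&lt;p&gt;You can make the same comparison from inside a program. The sketch below is only an illustration and is written for Python 3, unlike the Python 2 snippets elsewhere in this post. The helper name &lt;code&gt;measure&lt;/code&gt; is made up for this example. It compares wall-clock time with CPU time using the standard library’s &lt;code&gt;time.perf_counter&lt;/code&gt; and &lt;code&gt;time.process_time&lt;/code&gt;:&lt;/p&gt;

```python
import time


def measure(func, *args, **kwargs):
    """Run func once and return (wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # user + sys CPU time of this process
    func(*args, **kwargs)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


# Sleeping waits without using the CPU, much like blocking on IO.
wall, cpu = measure(time.sleep, 0.2)
print("wall: %.3fs  cpu: %.3fs" % (wall, cpu))
```

&lt;p&gt;A large gap between the two numbers hints at waiting (IO, sleeping, locks) rather than computation.&lt;/p&gt;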

&lt;h3 id=&quot;fine-grain-timing-with-a-timing-context-manager&quot;&gt;Fine grain timing with a timing context manager&lt;/h3&gt;

&lt;p&gt;Our next technique involves direct instrumentation of the code to get access to finer grain timing information. Here’s a small snippet I’ve found invaluable for making ad-hoc timing measurements:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;timer.py&lt;/code&gt;&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;time&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Timer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;object&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;verbose&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;verbose&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;verbose&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__enter__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__exit__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;end&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;secs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;end&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msecs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;secs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;  &lt;span class=&quot;c&quot;&gt;# millisecs&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;verbose&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;#39;elapsed time: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%f&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; ms&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msecs&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;In order to use it, wrap blocks of code that you want to time with Python’s &lt;code&gt;with&lt;/code&gt; keyword and this &lt;code&gt;Timer&lt;/code&gt; context manager. It will take care of starting the timer when your code block begins execution and stopping the timer when your code block ends.&lt;/p&gt;

&lt;p&gt;Here’s an example use of the snippet:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;timer&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Timer&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;redis&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Redis&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;rdb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Redis&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Timer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;rdb&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lpush&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;foo&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;bar&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;=&amp;gt; elapsed lpush: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; s&amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;secs&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Timer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;rdb&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lpop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;foo&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;=&amp;gt; elapsed lpop: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; s&amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;secs&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;I’ll often log the outputs of these timers to a file in order to see how my program’s performance evolves over time.&lt;/p&gt;
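
&lt;p&gt;As a rough sketch of what that logging could look like (Python 3, standard library only), the example below appends each measurement to a log file. The names &lt;code&gt;logged_timer&lt;/code&gt;, &lt;code&gt;LOG_PATH&lt;/code&gt; and the &lt;code&gt;timings&lt;/code&gt; logger are assumptions made up for this illustration, not part of the &lt;code&gt;Timer&lt;/code&gt; class above:&lt;/p&gt;

```python
import logging
import os
import tempfile
import time
from contextlib import contextmanager

# Hypothetical log location, chosen only for this example.
LOG_PATH = os.path.join(tempfile.gettempdir(), "timings.log")

timing_log = logging.getLogger("timings")
timing_log.setLevel(logging.INFO)
timing_log.addHandler(logging.FileHandler(LOG_PATH))


@contextmanager
def logged_timer(label):
    """Time the enclosed block and append the result to LOG_PATH."""
    start = time.time()
    try:
        yield
    finally:
        # Runs even if the block raises, so failed runs are timed too.
        msecs = (time.time() - start) * 1000
        timing_log.info("%s %.3f ms", label, msecs)


with logged_timer("sum-of-squares"):
    sum(i * i for i in range(100000))
```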

&lt;h3 id=&quot;line-by-line-timing-and-execution-frequency-with-a-profiler&quot;&gt;Line-by-line timing and execution frequency with a profiler&lt;/h3&gt;

&lt;p&gt;Robert Kern has a nice project called &lt;a href=&quot;http://packages.python.org/line_profiler/&quot;&gt;line_profiler&lt;/a&gt; which I often use to see how fast and how often each line of code is running in my scripts.&lt;/p&gt;

&lt;p&gt;To use it, you’ll need to install the python package via pip:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;pip install line_profiler
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Once installed you’ll have access to a new module called “line_profiler” as well as an executable script “kernprof.py”. &lt;/p&gt;

&lt;p&gt;To use this tool, first modify your source code by decorating the function you want to measure with the &lt;code&gt;@profile&lt;/code&gt; decorator. Don’t worry, you don’t have to import anything in order to use this decorator. The &lt;code&gt;kernprof.py&lt;/code&gt; script automatically injects it into your script’s runtime during execution.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;primes.py&lt;/code&gt;&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;&lt;span class=&quot;nd&quot;&gt;@profile&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;primes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt; 
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;mroot&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;**&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;half&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mroot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;m&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;m&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;half&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;m&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;primes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Once you’ve set up your code with the &lt;code&gt;@profile&lt;/code&gt; decorator, use &lt;code&gt;kernprof.py&lt;/code&gt; to run your script.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;kernprof.py -l -v primes.py
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;-l&lt;/code&gt; option tells kernprof to inject the &lt;code&gt;@profile&lt;/code&gt; decorator into your script’s builtins, and &lt;code&gt;-v&lt;/code&gt; tells kernprof to display timing information once your script finishes. Here’s what the output should look like for the above script:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Wrote profile results to primes.py.lprof
Timer unit: 1e-06 s

File: primes.py
Function: primes at line 2
Total time: 0.00019 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     2                                           @profile
     3                                           def primes(n): 
     4         1            2      2.0      1.1      if n==2:
     5                                                   return [2]
     6         1            1      1.0      0.5      elif n&amp;lt;2:
     7                                                   return []
     8         1            4      4.0      2.1      s=range(3,n+1,2)
     9         1           10     10.0      5.3      mroot = n ** 0.5
    10         1            2      2.0      1.1      half=(n+1)/2-1
    11         1            1      1.0      0.5      i=0
    12         1            1      1.0      0.5      m=3
    13         5            7      1.4      3.7      while m &amp;lt;= mroot:
    14         4            4      1.0      2.1          if s[i]:
    15         3            4      1.3      2.1              j=(m*m-3)/2
    16         3            4      1.3      2.1              s[j]=0
    17        31           31      1.0     16.3              while j&amp;lt;half:
    18        28           28      1.0     14.7                  s[j]=0
    19        28           29      1.0     15.3                  j+=m
    20         4            4      1.0      2.1          i=i+1
    21         4            4      1.0      2.1          m=2*i+3
    22        50           54      1.1     28.4      return [2]+[x for x in s if x]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Look for lines with a high number of hits or a high time interval. These are the areas where optimizations can yield the greatest improvements.&lt;/p&gt;
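
&lt;p&gt;One practical wrinkle: because &lt;code&gt;@profile&lt;/code&gt; only exists when the script runs under kernprof, running a decorated script directly raises a &lt;code&gt;NameError&lt;/code&gt;. A common workaround, sketched here for Python 3 (where the module is called &lt;code&gt;builtins&lt;/code&gt;; Python 2 names it &lt;code&gt;__builtin__&lt;/code&gt;), is to define a do-nothing stand-in when the real decorator is missing. The function &lt;code&gt;squares&lt;/code&gt; is just a placeholder for this example:&lt;/p&gt;

```python
import builtins

# kernprof -l adds "profile" to builtins; supply a no-op stand-in otherwise.
if not hasattr(builtins, "profile"):
    def profile(func):
        return func


@profile
def squares(n):
    return [i * i for i in range(n)]


print(squares(5))
```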

&lt;h3 id=&quot;how-much-memory-does-it-use&quot;&gt;How much memory does it use?&lt;/h3&gt;

&lt;p&gt;Now that we have a good grasp on timing our code, let’s move on to figuring out how much memory our programs are using. Fortunately for us, Fabian Pedregosa has implemented a nice &lt;a href=&quot;https://github.com/fabianp/memory_profiler&quot;&gt;memory profiler&lt;/a&gt; modeled after Robert Kern’s line_profiler. &lt;/p&gt;

&lt;p&gt;First install it via pip:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;pip install -U memory_profiler
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;pip install psutil
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;(Installing the &lt;code&gt;psutil&lt;/code&gt; package here is recommended because it greatly improves the performance of the memory_profiler).&lt;/p&gt;

&lt;p&gt;Like line_profiler, memory_profiler requires that you decorate your function of interest with an &lt;code&gt;@profile&lt;/code&gt; decorator like so:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;&lt;span class=&quot;nd&quot;&gt;@profile&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;primes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt; 
    &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;To see how much memory your function uses run the following:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;bash&quot;&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;python -m memory_profiler primes.py
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;You should see output that looks like this once your program exits:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Filename: primes.py

Line #    Mem usage  Increment   Line Contents
==============================================
     2                           @profile
     3    7.9219 MB  0.0000 MB   def primes(n): 
     4    7.9219 MB  0.0000 MB       if n==2:
     5                                   return [2]
     6    7.9219 MB  0.0000 MB       elif n&amp;lt;2:
     7                                   return []
     8    7.9219 MB  0.0000 MB       s=range(3,n+1,2)
     9    7.9258 MB  0.0039 MB       mroot = n ** 0.5
    10    7.9258 MB  0.0000 MB       half=(n+1)/2-1
    11    7.9258 MB  0.0000 MB       i=0
    12    7.9258 MB  0.0000 MB       m=3
    13    7.9297 MB  0.0039 MB       while m &amp;lt;= mroot:
    14    7.9297 MB  0.0000 MB           if s[i]:
    15    7.9297 MB  0.0000 MB               j=(m*m-3)/2
    16    7.9258 MB -0.0039 MB               s[j]=0
    17    7.9297 MB  0.0039 MB               while j&amp;lt;half:
    18    7.9297 MB  0.0000 MB                   s[j]=0
    19    7.9297 MB  0.0000 MB                   j+=m
    20    7.9297 MB  0.0000 MB           i=i+1
    21    7.9297 MB  0.0000 MB           m=2*i+3
    22    7.9297 MB  0.0000 MB       return [2]+[x for x in s if x]
&lt;/code&gt;&lt;/pre&gt;
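
&lt;p&gt;If you cannot install extra packages, Python 3 ships a different tool, &lt;code&gt;tracemalloc&lt;/code&gt;, that gives a coarser answer to the same question. Note that this is a swapped-in technique and not how memory_profiler works: it tracks allocations made by Python itself, not the memory of the whole process. A minimal sketch, with the list &lt;code&gt;data&lt;/code&gt; standing in for real work:&lt;/p&gt;

```python
import tracemalloc

tracemalloc.start()

before, _ = tracemalloc.get_traced_memory()
data = [str(i) for i in range(100000)]  # allocate something noticeable
after, peak = tracemalloc.get_traced_memory()

tracemalloc.stop()

print("grew by %.2f MB (peak %.2f MB)" % ((after - before) / 1e6, peak / 1e6))
```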

&lt;h3 id=&quot;wheres-the-memory-leak&quot;&gt;Where’s the memory leak?&lt;/h3&gt;

&lt;p&gt;The CPython interpreter uses reference counting as its main method of keeping track of memory. This means that every object contains a counter, which is incremented when a reference to the object is stored somewhere, and decremented when a reference to it is deleted. When the counter reaches zero, the CPython interpreter knows that the object is no longer in use so it deletes the object and deallocates the occupied memory.&lt;/p&gt;

&lt;p&gt;A memory leak can often occur in your program if references to objects are held even though the object is no longer in use.&lt;/p&gt;

&lt;p&gt;The quickest way to find these “memory leaks” is to use an awesome tool called &lt;a href=&quot;http://mg.pov.lt/objgraph/&quot;&gt;objgraph&lt;/a&gt; written by Marius Gedminas. This tool allows you to see the number of objects in memory and also locate all the different places in your code that hold references to these objects.&lt;/p&gt;

&lt;p&gt;To get started, first install &lt;code&gt;objgraph&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;bash&quot;&gt;pip install objgraph
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Once you have this tool installed, insert into your code a statement to invoke the debugger:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pdb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pdb&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set_trace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h5 id=&quot;which-objects-are-the-most-common&quot;&gt;Which objects are the most common?&lt;/h5&gt;

&lt;p&gt;At run time, you can inspect the most prevalent object types in your program by running:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;(pdb) import objgraph
(pdb) objgraph.show_most_common_types()

MyBigFatObject             20000
tuple                      16938
function                   4310
dict                       2790
wrapper_descriptor         1181
builtin_function_or_method 934
weakref                    764
list                       634
method_descriptor          507
getset_descriptor          451
type                       439
&lt;/code&gt;&lt;/pre&gt;

&lt;h5 id=&quot;which-objects-have-been-added-or-deleted&quot;&gt;Which objects have been added or deleted?&lt;/h5&gt;

&lt;p&gt;We can also see which objects have been added or deleted between two points in time:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;(pdb) import objgraph
(pdb) objgraph.show_growth()
.
.
.
(pdb) objgraph.show_growth()   # this only shows objects that have been added or deleted since the last show_growth() call

traceback                4        +2
KeyboardInterrupt        1        +1
frame                   24        +1
list                   667        +1
tuple                16969        +1
&lt;/code&gt;&lt;/pre&gt;
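
&lt;p&gt;If you’re curious how this kind of report can be produced, the sketch below approximates the idea with the standard library’s &lt;code&gt;gc&lt;/code&gt; module (Python 3). It is not objgraph’s actual implementation, and &lt;code&gt;type_counts&lt;/code&gt; and &lt;code&gt;Leaky&lt;/code&gt; are names invented for this example. It counts objects by type at two points in time and prints the types that grew:&lt;/p&gt;

```python
import gc
from collections import Counter


def type_counts():
    """Count the objects tracked by the garbage collector, keyed by type name."""
    return Counter(type(o).__name__ for o in gc.get_objects())


class Leaky(object):
    pass


before = type_counts()
leaked = [Leaky() for _ in range(50)]
after = type_counts()

# Counter subtraction keeps only the types whose count went up.
growth = after - before
for name, delta in growth.most_common(5):
    print("%-20s +%d" % (name, delta))
```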

&lt;h5 id=&quot;what-is-referencing-this-leaky-object&quot;&gt;What is referencing this leaky object?&lt;/h5&gt;

&lt;p&gt;Continuing down this route, we can also see where references to any given object are being held. Let’s take as an example the simple program below:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;a&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pdb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pdb&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;set_trace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;To see what is holding a reference to the variable &lt;code&gt;x&lt;/code&gt;, run the &lt;code&gt;objgraph.show_backrefs()&lt;/code&gt; function:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;(pdb) import objgraph
(pdb) objgraph.show_backrefs([x], filename=&quot;/tmp/backrefs.png&quot;)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The output of that command should be a PNG image stored at &lt;code&gt;/tmp/backrefs.png&lt;/code&gt; and it should look something like this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/media/3011/backrefs.png&quot; alt=&quot;back references&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The box at the bottom with red lettering is our object of interest. We can see that it’s referenced by the symbol &lt;code&gt;x&lt;/code&gt; once and by the list &lt;code&gt;y&lt;/code&gt; three times. If &lt;code&gt;x&lt;/code&gt; is the object causing a memory leak, we can use this method to see why it’s not automatically being deallocated by tracking down all of its references.&lt;/p&gt;
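
&lt;p&gt;When you only need a quick text answer and not a picture, the standard library’s &lt;code&gt;gc.get_referrers&lt;/code&gt; gives a similar view, one level deep. This is a minimal Python 3 sketch using the same &lt;code&gt;x&lt;/code&gt; and &lt;code&gt;y&lt;/code&gt; as above; the variable &lt;code&gt;referrers&lt;/code&gt; is introduced only for this example:&lt;/p&gt;

```python
import gc

x = [1]
y = [x, [x], {"a": x}]

# gc.get_referrers returns the container objects that refer directly to x.
referrers = gc.get_referrers(x)
for ref in referrers:
    print(type(ref).__name__)
```

&lt;p&gt;Expect to see the list &lt;code&gt;y&lt;/code&gt;, its inner list and its inner dict among the results, usually alongside the namespace that holds the name &lt;code&gt;x&lt;/code&gt;.&lt;/p&gt;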

&lt;p&gt;So to review, &lt;a href=&quot;http://mg.pov.lt/objgraph/&quot;&gt;objgraph&lt;/a&gt; allows us to:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;show the top N objects occupying our python program’s memory&lt;/li&gt;
  &lt;li&gt;show what objects have been deleted or added over a period of time&lt;/li&gt;
  &lt;li&gt;show all references to a given object in our script&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;effort-vs-precision&quot;&gt;Effort vs precision&lt;/h3&gt;

&lt;p&gt;In this post, I’ve shown you how to use several tools to analyze a python program’s performance. Armed with these tools and techniques you should have all the information required to track down most memory leaks as well as identify speed bottlenecks in a Python program.&lt;/p&gt;

&lt;p&gt;As with many other topics, running a performance analysis means balancing the tradeoffs between effort and precision. When in doubt, implement the simplest solution that will suit your current needs.&lt;/p&gt;

&lt;h5 id=&quot;refrences&quot;&gt;References&lt;/h5&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;http://stackoverflow.com/questions/556405/what-do-real-user-and-sys-mean-in-the-output-of-time1&quot;&gt;stack overflow - time explained&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://packages.python.org/line_profiler/&quot;&gt;line_profiler&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/fabianp/memory_profiler&quot;&gt;memory_profiler&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://mg.pov.lt/objgraph/&quot;&gt;objgraph&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</content>
   
 </entry>
 
 <entry>
   <title>Modifying Python's SimpleHTTPServer to accept directory aliases</title>
   <link href="http://www.huyng.com/posts/modifying-python-simplehttpserver"/>
   <updated>2012-05-28T00:00:00-07:00</updated>
   <id>http://www.huyng.com/posts/modifying-python-simplehttpserver</id>
   
   <category term="programming" />
   
   <category term="python" />
   

   <content type="html">&lt;p&gt;I often use Python’s SimpleHTTPServer as an ad hoc means of 
sharing files. As it turns out, it’s also a useful tool for
previewing the output of static site generators 
like &lt;a href=&quot;http://jekyllrb.com/&quot;&gt;Jekyll&lt;/a&gt; (which is what this site is built upon).&lt;/p&gt;

&lt;p&gt;The command below is a quick way to spawn 
a web server that serves the contents of the 
current directory.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;bash&quot;&gt;python -m SimpleHTTPServer
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;And since Python comes pre-installed 
on most Linux distros and on Mac OS X, this is a no-nonsense 
way to serve files.&lt;/p&gt;

&lt;p&gt;No dependencies, no installation steps, it &lt;em&gt;just works&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;That said, there are times when I wish it had just one extra piece
of functionality: the ability to route different URL prefixes to 
different directories.&lt;/p&gt;

&lt;p&gt;The script below overrides a single method, &lt;code&gt;translate_path&lt;/code&gt;, in SimpleHTTPServer’s request handler
and makes this possible without adding any extra dependencies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;server.py&lt;/strong&gt;&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;os&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;posixpath&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;urllib&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;BaseHTTPServer&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;SimpleHTTPServer&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SimpleHTTPRequestHandler&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# modify this to add additional routes&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ROUTES&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# [url_prefix ,  directory_path]&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;/media&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;#39;/var/www/media&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;       &lt;span class=&quot;s&quot;&gt;&amp;#39;/var/www/site&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;  &lt;span class=&quot;c&quot;&gt;# empty string for the &amp;#39;default&amp;#39; match&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; 

&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;RequestHandler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SimpleHTTPRequestHandler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;translate_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;sd&quot;&gt;&amp;quot;&amp;quot;&amp;quot;translate path given routes&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

        &lt;span class=&quot;c&quot;&gt;# set default root to cwd&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;root&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getcwd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        
        &lt;span class=&quot;c&quot;&gt;# look up routes and set root directory accordingly&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pattern&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rootdir&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ROUTES&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;startswith&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pattern&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
                &lt;span class=&quot;c&quot;&gt;# found match!&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pattern&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):]&lt;/span&gt;  &lt;span class=&quot;c&quot;&gt;# consume path up to pattern len&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;root&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rootdir&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;break&lt;/span&gt;
        
        &lt;span class=&quot;c&quot;&gt;# normalize path and prepend root directory&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;?&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;#&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;posixpath&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;normpath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;urllib&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unquote&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;words&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;/&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;words&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;words&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        
        &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;root&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;word&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;words&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;drive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;word&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;splitdrive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;word&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;head&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;word&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;word&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;word&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;curdir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pardir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;word&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;#39;__main__&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;BaseHTTPServer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;test&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RequestHandler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BaseHTTPServer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HTTPServer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;While not as convenient as the one-liner above, you can run this script by simply
invoking &lt;code&gt;python server.py&lt;/code&gt;.&lt;/p&gt;
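
&lt;p&gt;The route lookup at the heart of the script above can also be pulled out into a standalone function, which makes it easy to try in a shell. What follows is a minimal sketch rather than a drop-in replacement: it assumes Python 3 (where &lt;code&gt;unquote&lt;/code&gt; lives in &lt;code&gt;urllib.parse&lt;/code&gt;), the name &lt;code&gt;resolve&lt;/code&gt; is hypothetical, and unlike the script above it normalizes the path first and only matches a prefix on whole path segments.&lt;/p&gt;

```python
import os
import posixpath
from urllib.parse import unquote

# Hypothetical routes for illustration: (url_prefix, directory_path)
ROUTES = (
    ("/media", "/var/www/media"),
    ("", "/var/www/site"),  # empty prefix acts as the default match
)


def resolve(path, routes=ROUTES):
    """Map a request path to a filesystem path using the first matching route."""
    # drop the query string and fragment, then normalize the path
    path = path.split("?", 1)[0].split("#", 1)[0]
    path = posixpath.normpath(unquote(path))

    for prefix, rootdir in routes:
        # match whole segments only, so "/mediafoo" does not hit "/media"
        if prefix == "" or path == prefix or path.startswith(prefix + "/"):
            remainder = path[len(prefix):]
            break
    else:
        # no route matched: fall back to the current working directory
        rootdir, remainder = os.getcwd(), path

    # skip empty segments and any "." or ".." left after normalization
    words = [w for w in remainder.split("/")
             if w and w not in (os.curdir, os.pardir)]
    return os.path.join(rootdir, *words)
```

&lt;p&gt;With the routes above, &lt;code&gt;resolve(&quot;/media/img/a.png?x=1&quot;)&lt;/code&gt; lands under &lt;code&gt;/var/www/media&lt;/code&gt;, while any other path falls through to &lt;code&gt;/var/www/site&lt;/code&gt;. An overridden &lt;code&gt;translate_path&lt;/code&gt; could simply return the result of such a function.&lt;/p&gt;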

</content>
   
 </entry>
 
 <entry>
   <title>Having great conversations, in spite of being a programmer</title>
   <link href="http://www.huyng.com/posts/having-great-conversations"/>
   <updated>2012-03-21T00:00:00-07:00</updated>
   <id>http://www.huyng.com/posts/having-great-conversations</id>
   
   <category term="techlife" />
   

   <content type="html">&lt;p&gt;Listening is the most important skill for having great conversations. I’ve had to learn and relearn this skill so many times throughout my career. As a programmer, I’ve come to realize that the mode of thinking required to be good at my job and the mode of thinking needed to have great conversations are completely at odds with one another.&lt;/p&gt;

&lt;p&gt;Programming is one of those activities where in order to achieve a high level of productivity you have to shut out the world. You put on blinders and dive into an alternate reality where bit by bit you’re constructing a delicate mental model of the problem at hand.&lt;/p&gt;

&lt;p&gt;On difficult projects, the concepts can be quite slippery to grasp, and ideas often need several iterations of refinement. Some of the best programmers I know revisit a concept dozens of times before it’s ready to implement.&lt;/p&gt;

&lt;p&gt;To a certain extent, I believe all programmers possess this craving to refactor and refine ideas. And it’s this exact same quality that impedes so many of us when it comes time to have conversations with other people.&lt;/p&gt;

&lt;p&gt;Programmers end up missing out on a lot in conversations because our minds so readily want to explore and refine the dozens of ideas that come up. Unfortunately, the tempo of a typical conversation is so rapid that ideas relevant in one moment may be irrelevant in the next. Our tendency to explore and refine leads us &lt;a href=&quot;http://sridattalabs.com/2012/02/06/rabbit-holes-being-smart-hurts-prod/&quot;&gt;down rabbit holes&lt;/a&gt;, and before we know it, the other person has already moved on to the next subject.&lt;/p&gt;

&lt;h3 id=&quot;being-explicit-about-switching-modes&quot;&gt;Being explicit about switching modes&lt;/h3&gt;

&lt;p&gt;The trick to overcoming these bad habits is to be explicit when you’re switching modes between programming and having conversations with people. You have to mentally recognize the change in tempo and that your mode of thinking needs to adapt.&lt;/p&gt;

&lt;p&gt;As a physical trigger, I wiggle my toes during these transitions, and it lets me know that okay … it’s time to shut off my inner voice and actively listen to what the other person is saying. &lt;/p&gt;

&lt;p&gt;Once this is done, the rest of the mechanics fall into place. By actively listening, I don’t get distracted by minor details and can truly absorb what the other person is trying to convey. With this understanding I can deliver my comments and questions when they are most relevant.&lt;/p&gt;

&lt;p&gt;I’ve gotten better at having good conversations over the years, and most of that has come through conscious repeated practice of this simple little trick.&lt;/p&gt;

</content>
   
 </entry>
 
 <entry>
   <title>Could testing in Python be simplified?</title>
   <link href="http://www.huyng.com/posts/could-testing-in-python-be-simplified"/>
   <updated>2011-05-28T00:00:00-07:00</updated>
   <id>http://www.huyng.com/posts/could-testing-in-python-be-simplified</id>
   
   <category term="python" />
   
   <category term="programming" />
   

   <content type="html">
&lt;p&gt;&lt;strong&gt;Edited on May 29:&lt;/strong&gt; After writing this article, &lt;a href=&quot;http://www.huyng.com/archives/could-testing-in-python-be-simplified/792/comment-page-1/#comment-9940&quot;&gt;Christian Heimes pointed out to me&lt;/a&gt; that the &lt;code&gt;debug&lt;/code&gt; method on &lt;em&gt;unittest.TestCase&lt;/em&gt; allows you to run tests interactively. The idiot in me didn't do enough research before writing this rant. I'm leaving this post up to provide some context for future visitors/googlers who are looking for a quick way to debug their test cases interactively.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;One lesson I've learned over many years of observation and programming in Python is that the easier I make writing and running tests for my programs, the faster I can produce bug-free software. That is why I start most projects nowadays with two pre-baked files: &lt;em&gt;main.py&lt;/em&gt; and &lt;em&gt;tests.py&lt;/em&gt;. &lt;/p&gt;
&lt;p&gt;With this setup and tools like &lt;a href=&quot;http://somethingaboutorange.com/mrl/projects/nose/1.0.0/&quot;&gt;Nose&lt;/a&gt;, testing code in Python is nearly pain free for me. Recently, however, I noticed that one aspect of the &lt;strong&gt;unittest&lt;/strong&gt; package in Python's standard library drives me absolutely crazy:&lt;/p&gt;
&lt;p&gt;Why is there no simple method to run a single test case from within an interactive python shell?&lt;/p&gt;
&lt;p&gt;As developers, this is important because if we are going to make writing tests an integral part of
our development workflow, we need some way to actually run the damn test without dropping out of
our Python sessions and breaking our mental flow.&lt;/p&gt;
&lt;p&gt;Being forced to run tests solely from the command line also means we can't take advantage of features like IPython's ability to automatically &lt;a href=&quot;http://ipython.scipy.org/doc/manual/html/interactive/tutorial.html#debug-a-python-script&quot;&gt;drop into a pdb debugging session&lt;/a&gt; when an error occurs. This is extremely useful when you want to inspect one or two variables to determine why a test failure occurred.&lt;/p&gt;
&lt;p&gt;In general, it seems like the &lt;em&gt;unittest&lt;/em&gt; library's lack of a simple function to run single test cases in an interactive shell discourages programmers from writing tests and detracts from Python's whole &quot;rapid iteration&quot; ethos.&lt;/p&gt;
&lt;p&gt;After googling around and failing to find any good alternatives, I've finally settled on writing my own utility function to run single test cases:&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;runtest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;testCase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;methodName&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;sd&quot;&gt;&amp;quot;&amp;quot;&amp;quot; Runs a test case from within interactive shell &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;tc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;testCase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;methodName&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;getattr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;setUp&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)()&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;nb&quot;&gt;getattr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;methodName&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)()&lt;/span&gt;  
    &lt;span class=&quot;k&quot;&gt;finally&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;nb&quot;&gt;getattr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;tearDown&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I now add this function to every project I create, and writing tests feels so much more natural. I also find myself writing more tests and running them more often with this function in my projects.&lt;/p&gt;
&lt;p&gt;Give it a try, and let me know whether this has any effect for you. I'd love to hear your feedback on how it affects your development workflow.&lt;/p&gt;
</content>
   
 </entry>
 
 <entry>
   <title>Textmate hack: A keyboard shortcut for opening files in a new window</title>
   <link href="http://www.huyng.com/posts/textmate-hack-a-keyboard-shortcut-for-opening-files-in-a-new-window"/>
   <updated>2011-05-11T00:00:00-07:00</updated>
   <id>http://www.huyng.com/posts/textmate-hack-a-keyboard-shortcut-for-opening-files-in-a-new-window</id>
   
   <category term="workflow" />
   

   <content type="html">&lt;!----&gt;
&lt;p&gt;TextMate has no split windows, and you can't open the same file within a project in a new window without resorting to using the mouse. Here's what I mean:&lt;/p&gt;
&lt;center&gt;&lt;img alt=&quot;textmate open in new window&quot; src=&quot;http://i.imgur.com/JAcOs.png&quot; /&gt;&lt;/center&gt;
&lt;p&gt;Situations often arise where you have to reference one file while editing another. Maybe you want to view a header file while editing its implementation code. Or maybe you want a TODO list in one window while working on the rest of your code base.&lt;/p&gt;
&lt;p&gt;In these scenarios tabbed editing becomes extremely painful, and the alternative of mousing around to open a new window has always felt like a clunky solution to me. Here's a hack to create a keyboard shortcut for opening files in new windows. Oh what a hack it is:&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span class=&quot;c&quot;&gt;#!/bin/sh&lt;/span&gt;

&lt;span class=&quot;nv&quot;&gt;tof&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$TM_FILEPATH&lt;/span&gt;&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;$tof&lt;/span&gt;

mate &lt;span class=&quot;s2&quot;&gt;&amp;quot;$tof&amp;quot;&lt;/span&gt; .RANDOM_STRING_DOES_NOT_MATTER &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; osascript &lt;span class=&quot;s&quot;&gt;&amp;lt;&amp;lt;EOF&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;tell application &amp;quot;System Events&amp;quot;&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;    delay 0.2&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;    tell application &amp;quot;TextMate&amp;quot; to activate&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;    keystroke &amp;quot;t&amp;quot; using {command down}&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;    keystroke return&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;    keystroke &amp;quot;d&amp;quot; using {command down, control down, option down}&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;end tell&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;EOF&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;To use this, create a new bundle command, add this code to it, and assign your keyboard shortcut to the command.&lt;/p&gt;
</content>
   
 </entry>
 
 <entry>
   <title>A poorman's Excel using sqlite, apsw, and csv2sql</title>
   <link href="http://www.huyng.com/posts/a-poormans-excel-using-sqlite-apsw-and-csv2sql"/>
   <updated>2011-04-15T00:00:00-07:00</updated>
   <id>http://www.huyng.com/posts/a-poormans-excel-using-sqlite-apsw-and-csv2sql</id>
   
   <category term="python" />
   
   <category term="programming" />
   

   <content type="html">&lt;!----&gt;
&lt;p&gt;Excel has always been my go-to tool for exploring the abundance of CSV-formatted data that I often find on the internet. However, every once in a while large datasets like these &lt;a href=&quot;http://www.flcdatacenter.com/download/H1B_2010_TEXT.zip&quot;&gt;H1B salary figures&lt;/a&gt; show up&lt;sup id=&quot;fnref:2&quot;&gt;&lt;a href=&quot;#fn:2&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;, where the sheer number of records exceeds Excel's 65k row limit.&lt;/p&gt;
&lt;p&gt;In these scenarios I turn to sqlite and a recently discovered project, &lt;a href=&quot;http://code.google.com/p/apsw/&quot;&gt;apsw&lt;/a&gt; for my data analysis-fu. &lt;/p&gt;
&lt;h3&gt;Importing data into sqlite3&lt;/h3&gt;
&lt;p&gt;To start, I load up the data using a script I created a while back called &lt;strong&gt;csv2sql.py&lt;/strong&gt; &lt;sup id=&quot;fnref:1&quot;&gt;&lt;a href=&quot;#fn:1&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;. Although sqlite can load CSV files directly, I often find that its importer chokes on slightly malformed files. So here's my short script that reinvents the wheel a little better using Python:&lt;/p&gt;
&lt;pre&gt;
Usage: csv2sql CSVFILE

Options:
  -h, --help  show this help message and exit
  -o DBNAME   output sqlite3 database file
  -t TABLE    default table name for data import
&lt;/pre&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span class=&quot;c&quot;&gt;#!/usr/bin/env python&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;sys&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;csv&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;re&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;codecs&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;optparse&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;op&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;sqlite3&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;db&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;itertools&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;islice&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;OptionParser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;usage&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;%prog CSVFILE&amp;quot;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_option&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;-o&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;action&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;store&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dest&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;dbname&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;data.db&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;output sqlite3 database file&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_option&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;-t&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;action&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;store&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dest&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;table&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;Records&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;default table name for data import&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fpath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dbname&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;codecs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fpath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;encoding&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;utf-8&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;ignore&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fh&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;rows&lt;/span&gt;    &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;csv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reader&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fh&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;fields&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;next&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rows&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;slugify&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;lambda&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sub&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;r&amp;#39;\W+&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;_&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;upper&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;schema&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n\t\t&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;,&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n\t\t&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; text&amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;slugify&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fields&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;Creating table: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%s&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dbname&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbname&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;      &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cursor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

        &lt;span class=&quot;c&quot;&gt;# create table with schema&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;stmt&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;CREATE TABLE &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; (&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;);&amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;schema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;     
        &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;execute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stmt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;Schema&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;======&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%s&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stmt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;c&quot;&gt;# insert&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;Adding records to database&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;stmt&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;INSERT INTO &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; (&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;) VALUES (&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;);&amp;quot;&lt;/span&gt; 
        &lt;span class=&quot;n&quot;&gt;stmt&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stmt&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;, &amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;slugify&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fields&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;, &amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;?&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fields&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;to_add&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;islice&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rows&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to_add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;sys&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stdout&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;.&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;executemany&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stmt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to_add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;to_add&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;islice&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rows&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;commit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;#39;__main__&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parse_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fpath&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opts&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dbname&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opts&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbname&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;h3&gt;Exploring the data with apsw&lt;/h3&gt;
&lt;p&gt;&lt;img alt=&quot;&quot; src=&quot;http://imgur.com/D47cL.png&quot; /&gt;&lt;/p&gt;
&lt;p&gt;After importing the csv file, I typically run SQL queries against the database in a sqlite shell to explore the data. &lt;a href=&quot;http://code.google.com/p/apsw/&quot;&gt;Apsw&lt;/a&gt; is a Python wrapper for sqlite3, which also includes an enhanced shell with features like tab completions and output modes for json and Python tuples. These two things combined make interactive data exploration extremely pleasant.&lt;/p&gt;
&lt;p&gt;Once you have apsw installed &lt;sup id=&quot;fnref:3&quot;&gt;&lt;a href=&quot;#fn:3&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;, create a short alias in your &lt;em&gt;.bashrc&lt;/em&gt; file so that you can invoke the enhanced sqlite shell from the commandline:&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span class=&quot;nb&quot;&gt;alias &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;sqlite&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;python -c &amp;quot;import apsw;apsw.main()&amp;quot;&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With this in place, you can invoke the shell by typing &lt;code&gt;sqlite data.db&lt;/code&gt; at your terminal prompt. Those familiar with the sqlite shell know that it can output the results of a query in several different formats (e.g. column, csv, line). To set the output mode, type &lt;code&gt;.mode &amp;lt;MODENAME&amp;gt;&lt;/code&gt; at the prompt before executing your queries.&lt;/p&gt;
&lt;p&gt;The two most useful output formats that apsw provides are the &quot;json&quot; and &quot;python&quot; modes. Here's what the output of the following SQL query looks like in &quot;json&quot; and &quot;python&quot; modes respectively.&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;job_title&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;wage_from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;avg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;wage_from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt; 
  &lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;job_title&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;like&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;&amp;quot;%SOFTWARE%&amp;quot;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;COLLATE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NOCASE&lt;/span&gt; 
  &lt;span class=&quot;k&quot;&gt;GROUP&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;job_title&lt;/span&gt; 
  &lt;span class=&quot;k&quot;&gt;ORDER&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;by&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;wage_from&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;LIMIT&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;img alt=&quot;json output&quot; src=&quot;http://imgur.com/6T7kz.png&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img alt=&quot;tuple output&quot; src=&quot;http://imgur.com/zl7jc.png&quot; /&gt;&lt;/p&gt;
&lt;p&gt;As you can see in the above example, this data is readily consumable in any standard Python shell. Cut and paste the output into a Python interpreter for further analysis, and use matplotlib to visualize it.&lt;/p&gt;
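&lt;p&gt;If cut-and-paste gets tedious, roughly the same rows can be pulled straight into Python with the standard library's &lt;code&gt;sqlite3&lt;/code&gt; module. The sketch below is only an illustration, not part of the original workflow: it builds a tiny in-memory table with invented rows (the table name &lt;code&gt;main&lt;/code&gt; and the columns &lt;code&gt;job_title&lt;/code&gt; and &lt;code&gt;wage_from&lt;/code&gt; are borrowed from the query above; the helper names and values are made up) and returns the averages as a list of tuples.&lt;/p&gt;

```python
import sqlite3

# Hypothetical sample rows; the real data would come from the imported csv file.
SAMPLE_ROWS = [
    ("Software Engineer", 90000),
    ("Software Engineer", 110000),
    ("Senior Software Engineer", 130000),
    ("Accountant", 60000),
]


def build_demo_db():
    """Create an in-memory database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE main (job_title text, wage_from integer)")
    conn.executemany("INSERT INTO main VALUES (?, ?)", SAMPLE_ROWS)
    conn.commit()
    return conn


def average_wages(conn, pattern):
    """Return (job_title, average wage) tuples for titles matching pattern."""
    query = (
        "SELECT job_title, avg(wage_from) FROM main "
        "WHERE job_title LIKE ? "
        "GROUP BY job_title ORDER BY avg(wage_from)"
    )
    return conn.execute(query, (pattern,)).fetchall()


if __name__ == "__main__":
    for title, wage in average_wages(build_demo_db(), "%software%"):
        print(title, wage)
```

&lt;p&gt;The list of tuples has the same shape as the shell's &quot;python&quot; output mode, so anything you would do with the pasted output applies here too.&lt;/p&gt;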
&lt;div class=&quot;footnote&quot;&gt;
&lt;hr /&gt;
&lt;ol&gt;
&lt;li id=&quot;fn:2&quot;&gt;
&lt;p&gt;Mentioned on &lt;a href=&quot;http://news.ycombinator.com/item?id=2444938&quot;&gt;Hacker News&lt;/a&gt;
&amp;#160;&lt;a href=&quot;#fnref:2&quot; rev=&quot;footnote&quot; title=&quot;Jump back to footnote 1 in the text&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&quot;fn:1&quot;&gt;
&lt;p&gt;This script assumes that the csv file's first row contains the field names for the data. This is how it automatically creates the table schema for you.
&amp;#160;&lt;a href=&quot;#fnref:1&quot; rev=&quot;footnote&quot; title=&quot;Jump back to footnote 2 in the text&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&quot;fn:3&quot;&gt;
&lt;p&gt;To install apsw, you'll need to make sure that the sqlite development libraries are in your library and include paths. &lt;del&gt;I finally was able to install apsw after using &lt;code&gt;brew install sqlite&lt;/code&gt; and setting the &lt;em&gt;libraries&lt;/em&gt; variable in setup.py to [&quot;/usr/local/lib&quot;]&lt;/del&gt; See Roger's comment below for a way to automatically retrieve all of the dependencies:
&amp;#160;&lt;a href=&quot;#fnref:3&quot; rev=&quot;footnote&quot; title=&quot;Jump back to footnote 3 in the text&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</content>
   
 </entry>
 
 <entry>
   <title>So that's how tracing JITs work ...</title>
   <link href="http://www.huyng.com/posts/so-thats-how-tracing-jits-work"/>
   <updated>2011-04-07T00:00:00-07:00</updated>
   <id>http://www.huyng.com/posts/so-thats-how-tracing-jits-work</id>
   
   <category term="programming" />
   

   <content type="html">&lt;!----&gt;
&lt;p&gt;From PyPy's blog: &lt;a href=&quot;http://morepypy.blogspot.com/2011/04/tutorial-part-2-adding-jit.html&quot;&gt;Adding a JIT to your interpreter&lt;/a&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;When it (the jit compiler) detects a loop of code in the target language that is executed often, the loop is considered &quot;hot&quot; and marked to be traced. The next time that loop is entered, the interpreter gets put in tracing mode where every executed instruction is logged.&lt;/p&gt;
&lt;p&gt;When the loop is finished, tracing stops. The trace of the loop is sent to an optimizer, and then to an assembler which outputs machine code. That machine code is then used for subsequent loop iterations.&lt;/p&gt;
&lt;p&gt;(The generated machine code) depends on several assumptions about the code. Therefore, the machine code will contain guards, to validate those assumptions. If a guard check fails, the runtime falls back to regular interpreted mode.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is one of the best explanations of JIT compilers I've found. I finally understand how they work after hearing the term tossed around so much lately.&lt;/p&gt;
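&lt;p&gt;To make the quoted description a little more concrete, here is a toy sketch of the three ideas it mentions: a hotness counter, a specialized fast path, and a guard that falls back to the interpreter. It is not how PyPy is implemented. The names &lt;code&gt;HOT_THRESHOLD&lt;/code&gt;, &lt;code&gt;ToyLoop&lt;/code&gt; and &lt;code&gt;make_trace&lt;/code&gt; are invented for illustration, the &quot;machine code&quot; is just a Python closure, and the trace is built at the moment the loop becomes hot rather than on the next loop entry.&lt;/p&gt;

```python
HOT_THRESHOLD = 3  # hypothetical cutoff, chosen small so the demo is easy to follow


def make_trace():
    """Return a fast path specialized for ints, protected by a type guard."""
    def compiled(acc, x):
        # Guard: the fast path is only valid for the type seen while tracing.
        if type(x) is not int:
            return False, acc
        return True, acc + x
    return compiled


class ToyLoop:
    """Sums a sequence, switching to a specialized path once the loop is hot."""

    def __init__(self):
        self.iterations = 0
        self.compiled = None  # stands in for generated machine code
        self.log = []         # which path handled each item

    def interpret(self, acc, x):
        # Generic slow path: copes with ints and numeric strings alike.
        self.log.append("interpreted")
        return acc + int(x)

    def run(self, items):
        acc = 0
        for x in items:
            self.iterations += 1
            if self.compiled is not None:
                ok, acc = self.compiled(acc, x)
                if ok:
                    self.log.append("compiled")
                    continue
                # Guard failed: fall back to the interpreter for this item.
            acc = self.interpret(acc, x)
            hot = self.iterations >= HOT_THRESHOLD
            # This toy only knows how to specialize for ints.
            if self.compiled is None and hot and type(x) is int:
                self.compiled = make_trace()
        return acc
```

&lt;p&gt;Running &lt;code&gt;ToyLoop().run([1, 2, 3, 4, &quot;5&quot;, 6])&lt;/code&gt; interprets the first three items, switches to the fast path, drops back to the interpreter for the string when the guard fails, and then resumes the fast path.&lt;/p&gt;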
</content>
   
 </entry>
 
 <entry>
   <title>Sane color scheme for Matplotlib</title>
   <link href="http://www.huyng.com/posts/sane-color-scheme-for-matplotlib"/>
   <updated>2011-02-08T00:00:00-08:00</updated>
   <id>http://www.huyng.com/posts/sane-color-scheme-for-matplotlib</id>
   
   <category term="python" />
   
   <category term="programming" />
   

   <content type="html">&lt;p&gt;
    John Hunter, creator of &lt;a href=&quot;http://matplotlib.sourceforge.net/&quot;&gt;Matplotlib&lt;/a&gt;, originally designed its color scheme to be familiar to Matlab users. As it turns out, the color scheme works well for publication material but not so well for viewing visualizations on the web.
&lt;/p&gt;
&lt;p&gt;
    I find the default styling for graphs produced using &lt;a href=&quot;http://had.co.nz/ggplot2/&quot;&gt;ggplot2&lt;/a&gt; aesthetically pleasing for this purpose, so I spent some time over the weekend refining the default colors and settings for my matplotlib installation. The result of this work is embodied in this &lt;a href=&quot;https://gist.github.com/816622&quot;&gt;.matplotlibrc color theme&lt;/a&gt; file. If you want graphs that look like the ones below &lt;em&gt;by default&lt;/em&gt;, download it and save it as &lt;code&gt;~/.matplotlib/matplotlibrc&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;&lt;img alt=&quot;scatter plot&quot; src=&quot;http://imgur.com/oPvCE.png&quot; /&gt;
&lt;img alt=&quot;bar charts&quot; src=&quot;http://imgur.com/063Hs.png&quot; /&gt;
&lt;img alt=&quot;line plot&quot; src=&quot;http://imgur.com/kjVuA.png&quot; /&gt;
&lt;img alt=&quot;time series&quot; src=&quot;http://imgur.com/Idqs4.png&quot; /&gt;
&lt;img alt=&quot;histogram&quot; src=&quot;http://imgur.com/Z0GWx.png&quot; /&gt;&lt;/p&gt;
</content>
   
 </entry>
 
 <entry>
   <title>Under the hood: An HTTP request with multipart/form-data</title>
   <link href="http://www.huyng.com/posts/under-the-hood-an-http-request-with-multipartform-data"/>
   <updated>2011-02-06T00:00:00-08:00</updated>
   <id>http://www.huyng.com/posts/under-the-hood-an-http-request-with-multipartform-data</id>
   
   <category term="python" />
   
   <category term="programming" />
   

   <content type="html">&lt;!----&gt;
&lt;p&gt;Check yourself: do you know what an HTTP request looks like coming over the wire? No, not the raw bits or the flowing electrons. Just try to picture the actual HTTP request body as text. Now imagine what it looks like if the HTTP request has an image attached. Here it is. This is what the internet is built on.&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;/
User-Agent: curl/7.21.2 (x86_64-apple-darwin)
Host: localhost:8080
Accept: */*
Content-Length: 1143
Expect: 100-continue
Content-Type: multipart/form-data; boundary=----------------------------83ff53821b7c

------------------------------83ff53821b7c
Content-Disposition: form-data; name=&amp;quot;img&amp;quot;; filename=&amp;quot;a.png&amp;quot;
Content-Type: application/octet-stream

?PNG

IHD?wS??iCCPICC Profilex?T?kA?6n??Zk?x?&amp;quot;IY?hE?6?bk
Y?&amp;lt;ߡ)??????9Nyx?+=?Y&amp;quot;|@5-?M?S?%?@?H8??qR&amp;gt;?׋??inf???O?????b??N?????~N??&amp;gt;?!?
??V?J?p?8?da?sZHO?Ln?}&amp;amp;???wVQ?y?g????E??0
 ??
   IDAc????????-IEND?B`?
------------------------------83ff53821b7c
Content-Disposition: form-data; name=&amp;quot;foo&amp;quot;

bar
------------------------------83ff53821b7c--
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You don't always need to know what's under the hood. But sometimes, it's just magical to realize that the thing you're using every day is nothing more than this. What other protocols do you take for granted yet find completely amazing?&lt;/p&gt;
&lt;p&gt;To watch raw HTTP requests yourself:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Get &lt;a href=&quot;https://gist.github.com/raw/814831/fb48dd96467a5e52edeff2010d53f30278926391/reflect.py&quot;&gt;reflect.py&lt;/a&gt; -- an echo server for HTTP requests&lt;/li&gt;
&lt;li&gt;Run this in terminal #1 &lt;code&gt;python reflect.py&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Run this in terminal #2 &lt;code&gt;curl -X POST  -F foo=bar -F img=@a.png localhost:8080&lt;/code&gt;&lt;br /&gt;
&lt;/li&gt;
&lt;/ol&gt;
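&lt;p&gt;The request body shown earlier can also be assembled by hand, which makes its structure easier to see: each part starts with two dashes plus the boundary, carries its own headers, and the body ends with the boundary followed by two more dashes. The following is a minimal sketch using only the standard library. The function name &lt;code&gt;encode_multipart&lt;/code&gt; and its arguments are invented for illustration, and it skips details a real client handles, such as escaping quotes in field names.&lt;/p&gt;

```python
import uuid


def encode_multipart(fields, files, boundary=None):
    """Build a multipart/form-data body.

    fields -- dict mapping field name to a text value
    files  -- dict mapping field name to a (filename, bytes) tuple
    Returns (content_type_header_value, body_bytes).
    """
    boundary = boundary or uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    parts = []

    # File parts carry a filename and a Content-Type of their own.
    for name, (filename, payload) in files.items():
        header = 'Content-Disposition: form-data; name="{}"; filename="{}"'
        parts.append(delimiter)
        parts.append(header.format(name, filename).encode("utf-8"))
        parts.append(b"Content-Type: application/octet-stream")
        parts.append(b"")  # blank line separates headers from content
        parts.append(payload)

    # Plain fields only need a Content-Disposition header.
    for name, value in fields.items():
        header = 'Content-Disposition: form-data; name="{}"'
        parts.append(delimiter)
        parts.append(header.format(name).encode("utf-8"))
        parts.append(b"")
        parts.append(value.encode("utf-8"))

    # The closing delimiter has two extra dashes at the end.
    parts.append(delimiter + b"--")
    parts.append(b"")

    body = b"\r\n".join(parts)
    return "multipart/form-data; boundary=" + boundary, body
```

&lt;p&gt;Calling it with one field named &lt;code&gt;foo&lt;/code&gt; and one file named &lt;code&gt;img&lt;/code&gt; produces the same layout as the curl request above, with a different boundary string.&lt;/p&gt;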
</content>
   
 </entry>
 
 <entry>
   <title>Django custom form widget for dictionary and tuple key-value pairs</title>
   <link href="http://www.huyng.com/posts/django-custom-form-widget-for-dictionary-and-tuple-key-value-pairs"/>
   <updated>2010-12-20T00:00:00-08:00</updated>
   <id>http://www.huyng.com/posts/django-custom-form-widget-for-dictionary-and-tuple-key-value-pairs</id>
   
   <category term="django" />
   
   <category term="python" />
   

   <content type="html">&lt;!----&gt;
&lt;p&gt;Here's a code snippet to ease your pains when asking for user input in the form of key-value pairs of data. I've been using this widget in more and more applications as Redis and other lightweight databases displace some of my traditional SQL storage solutions. &lt;/p&gt;
&lt;p&gt;I had to create this widget because none of the built-in Django widgets could generate arbitrarily long lists of key-value pairs for the user to modify &lt;em&gt;at runtime&lt;/em&gt;. It's especially useful for storing and displaying dictionary data. Here's an example of what it looks like after being rendered:&lt;/p&gt;
&lt;p&gt;&lt;img alt=&quot;django custom form widget for key-value input&quot; src=&quot;http://imgur.com/9uztp.png&quot; /&gt;&lt;/p&gt;
&lt;p&gt;To use this, set &lt;em&gt;JsonPairInputs&lt;/em&gt; as the widget on any form field. Here's a simple example:&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span class=&quot;n&quot;&gt;examplejsonfield&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;forms&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CharField&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;label&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;Example JSON Key Value Field&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;required&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                                   &lt;span class=&quot;n&quot;&gt;widget&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;JsonPairInputs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;val_attrs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;size&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;35&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
                                                           &lt;span class=&quot;n&quot;&gt;key_attrs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;class&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;large&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You may also pre-populate the input boxes by passing a json-encoded list of key-value pairs as the&lt;br /&gt;
&quot;initial&quot; argument of the &lt;em&gt;CharField&lt;/em&gt; above.&lt;/p&gt;
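&lt;p&gt;As a rough illustration of what such a json-encoded value looks like, the snippet below builds one with the standard library's &lt;code&gt;json&lt;/code&gt; module (the widget itself imports &lt;code&gt;simplejson&lt;/code&gt;, which offers the same &lt;code&gt;dumps&lt;/code&gt; and &lt;code&gt;loads&lt;/code&gt; calls). The pair names and values are made up for the example.&lt;/p&gt;

```python
import json

# Hypothetical pairs to show in the widget when the form first renders.
pairs = [["host", "localhost"], ["port", "6379"]]

# JSON has no tuple type, so each pair is written as a two-item list.
initial_value = json.dumps(pairs)

# Decoding gives back the same list of pairs.
decoded = json.loads(initial_value)
```

&lt;p&gt;The resulting string would then be handed to the field as &lt;code&gt;initial=initial_value&lt;/code&gt;.&lt;/p&gt;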
&lt;p&gt;&lt;strong&gt;widgets.py&lt;/strong&gt;&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;simplejson&lt;/span&gt; 
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;django.forms&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Widget&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;django.utils.encoding&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;force_unicode&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;django.utils.safestring&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mark_safe&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;django.forms.widgets&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;flatatt&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;JsonPairInputs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Widget&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;sd&quot;&gt;&amp;quot;&amp;quot;&amp;quot;A widget that displays JSON Key Value Pairs&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;    as a list of text input box pairs&lt;/span&gt;

&lt;span class=&quot;sd&quot;&gt;    Usage (in forms.py) :&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;    examplejsonfield = forms.CharField(label  = &amp;quot;Example JSON Key Value Field&amp;quot;, required = False,&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;                                       widget = JsonPairInputs(val_attrs={&amp;#39;size&amp;#39;:35},&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;                                                               key_attrs={&amp;#39;class&amp;#39;:&amp;#39;large&amp;#39;}))&lt;/span&gt;

&lt;span class=&quot;sd&quot;&gt;    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;**&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kwargs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;sd&quot;&gt;&amp;quot;&amp;quot;&amp;quot;A widget that displays JSON Key Value Pairs&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;        as a list of text input box pairs&lt;/span&gt;

&lt;span class=&quot;sd&quot;&gt;        kwargs:&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;        key_attrs -- html attributes applied to the 1st input box of each pair&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;        val_attrs -- html attributes applied to the 2nd input box of each pair&lt;/span&gt;

&lt;span class=&quot;sd&quot;&gt;        &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key_attrs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;val_attrs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;key_attrs&amp;quot;&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kwargs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key_attrs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kwargs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;key_attrs&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;val_attrs&amp;quot;&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kwargs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;val_attrs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kwargs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;val_attrs&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;Widget&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;**&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kwargs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;render&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;attrs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;sd&quot;&gt;&amp;quot;&amp;quot;&amp;quot;Renders this widget into an html string&lt;/span&gt;

&lt;span class=&quot;sd&quot;&gt;        args:&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;        name  (str)  -- name of the field&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;        value (str)  -- a json string of a two-tuple list automatically passed in by django&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;        attrs (dict) -- automatically passed in by django (unused in this function)&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;        &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;or&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;#39;{}&amp;#39;&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;twotuple&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;simplejson&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;loads&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;force_unicode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

        &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;#39;&amp;#39;&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; 
            &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;twotuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; 
                &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;key&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                       &lt;span class=&quot;s&quot;&gt;&amp;#39;value&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                       &lt;span class=&quot;s&quot;&gt;&amp;#39;fieldname&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                       &lt;span class=&quot;s&quot;&gt;&amp;#39;key_attrs&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;flatatt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key_attrs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                       &lt;span class=&quot;s&quot;&gt;&amp;#39;val_attrs&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;flatatt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;val_attrs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;#39;&amp;lt;input type=&amp;quot;text&amp;quot; name=&amp;quot;json_key[&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%(fieldname)s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;]&amp;quot; value=&amp;quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%(key)s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot; &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%(key_attrs)s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;gt; &amp;lt;input type=&amp;quot;text&amp;quot; name=&amp;quot;json_value[&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%(fieldname)s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;]&amp;quot; value=&amp;quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%(value)s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot; &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%(val_attrs)s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;gt;&amp;lt;br /&amp;gt;&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mark_safe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;value_from_datadict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;sd&quot;&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;        Returns the simplejson representation of the key-value pairs&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;        sent in the POST parameters&lt;/span&gt;

&lt;span class=&quot;sd&quot;&gt;        args:&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;        data  (dict)  -- request.POST or request.GET parameters&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;        files (list)  -- request.FILES&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;        name  (str)   -- the name of the field associated with this widget&lt;/span&gt;

&lt;span class=&quot;sd&quot;&gt;        &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;jsontext&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;  &lt;span class=&quot;c&quot;&gt;# returned when the field is absent from the submitted data&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;has_key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;json_key[&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;]&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;has_key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;json_value[&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;]&amp;#39;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt; 
            &lt;span class=&quot;n&quot;&gt;keys&lt;/span&gt;     &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getlist&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;json_key[&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;]&amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; 
            &lt;span class=&quot;n&quot;&gt;values&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getlist&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;json_value[&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;]&amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; 
            &lt;span class=&quot;n&quot;&gt;twotuple&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;zip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;values&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt; 
                &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                    &lt;span class=&quot;n&quot;&gt;twotuple&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt; 
            &lt;span class=&quot;n&quot;&gt;jsontext&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;simplejson&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dumps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;twotuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; 
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;jsontext&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
</content>
   
 </entry>
 
 <entry>
   <title>Gathering Leads using MTurk</title>
   <link href="http://www.huyng.com/posts/gathering-leads-using-mturk"/>
   <updated>2010-06-13T00:00:00-07:00</updated>
   <id>http://www.huyng.com/posts/gathering-leads-using-mturk</id>
   
   <category term="crowdsourcing" />
   
   <category term="python" />
   
   <category term="programming" />
   

   <content type="html">&lt;!-- End Meta --&gt;
&lt;p&gt;I want to share a technique with you that I have been using to streamline my &lt;a href=&quot;http://steveblank.com/category/customer-development/&quot;&gt;customer development&lt;/a&gt; process. It's experimental, but so far it's saving me a lot of time and producing good results. &lt;/p&gt;
&lt;p&gt;At the core, my technique involves using MTurk to collect targeted leads for some business ideas that have been brewing in my head. By using MTurk I was able to gather contact information from a huge list of websites in very little time. Sure, I could've written a scraper to find emails on most of these sites, but after weighing the cost/benefit ratio, I figured it just wasn't worth my time.&lt;/p&gt;
&lt;!--more--&gt;
&lt;h3&gt;The Task&lt;/h3&gt;
&lt;p&gt;To give you some context, I had a list of around 2000 urls pointing to local chamber of commerce sites scattered across the United States. Quite a few of my business ideas involve these organizations, so I wanted to &lt;a href=&quot;http://steveblank.com/category/customer-development/&quot;&gt;validate&lt;/a&gt;/gauge interest by sending out targeted email inquiries to each organization's leader. &lt;/p&gt;
&lt;p&gt;For those of you unfamiliar with MTurk, it is a crowdsourcing service from Amazon that lets you submit small, well-defined tasks to thousands of people around the world. You basically construct an HTML form with a special templating syntax and then submit a CSV file whose rows populate that form when Amazon displays it to its crowd of workers. &lt;/p&gt;
&lt;p&gt;You can see a sample of the task that I created below. &lt;/p&gt;
&lt;p&gt;&lt;img alt=&quot;mturk preview&quot; src=&quot;/media/3000/mturk_preview.png&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Aside: I generated the above form using &lt;a href=&quot;http://shpaml.webfactional.com/&quot;&gt;Shpaml&lt;/a&gt; which I highly recommend for writing short-hand HTML. It significantly cut down on my typing and development time.&lt;/p&gt;
&lt;h3&gt;The Results&lt;/h3&gt;
&lt;p&gt;I ran a test trial using 100 urls. The task was set up so that every url would be sent to at least 3 workers. In total, 300 turkers worked on my task. I submitted the task to Amazon MTurk on Sunday morning. When I came back 4 hours later, it was already completed! Although there were a few spammers, the overall quality was great for what I was trying to achieve. Check out a small sample of &lt;a href=&quot;http://www.huyng.com/wp-content/uploads/mtruk_results.csv_.zip&quot;&gt;my mturk results&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Out of the batch of 300, forty of the answers came from spammers. I know that 13 percent sounds like a lot! However, since I had 3 workers work on each URL, I quickly filtered out the spam by accepting only the answers on which two or more workers agreed.&lt;/p&gt;
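&lt;p&gt;The filtering step can be sketched in a few lines of Python 3. This is only a minimal sketch: the function name &lt;code&gt;consensus_answers&lt;/code&gt;, the (url, answer) row layout and the sample rows are assumptions made for illustration, not the format of the actual MTurk results file.&lt;/p&gt;

```python
from collections import Counter, defaultdict


def consensus_answers(rows, min_votes=2):
    """Keep, per URL, the answer that at least `min_votes` workers agree on."""
    answers_by_url = defaultdict(list)
    for url, answer in rows:
        # Normalize lightly so "Info@A.example " and "info@a.example" match.
        answers_by_url[url].append(answer.strip().lower())

    accepted = {}
    for url, answers in answers_by_url.items():
        answer, votes = Counter(answers).most_common(1)[0]
        if answer and votes >= min_votes:
            accepted[url] = answer
    return accepted


# Hypothetical rows: three workers per URL, some of them spammers.
rows = [
    ("http://example.org/a", "info@a.example"),
    ("http://example.org/a", "Info@A.example "),
    ("http://example.org/a", "buy cheap pills"),
    ("http://example.org/b", "spam"),
    ("http://example.org/b", "chamber@b.example"),
    ("http://example.org/b", "n/a"),
]
print(consensus_answers(rows))
```

&lt;p&gt;A URL where no two workers agree is simply dropped, so it can be resubmitted in a later batch.&lt;/p&gt;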

&lt;p&gt;For those of you trying to capture the local marketplace, or even if you're just interested in getting this data, send me a &lt;a href=&quot;/about/&quot;&gt;quick email&lt;/a&gt;! I will be running through the rest of my batch in the upcoming days.&lt;/p&gt;
</content>
   
 </entry>
 
 <entry>
   <title>Daily Signal (May 10, 2010)</title>
   <link href="http://www.huyng.com/posts/daily-signal-may-10-2010"/>
   <updated>2010-05-10T00:00:00-07:00</updated>
   <id>http://www.huyng.com/posts/daily-signal-may-10-2010</id>
   

   <content type="html">&lt;!-- End Meta --&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;http://code.google.com/p/pandas/&quot;&gt;Pandas - R's dataframes in Python&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Pandas&lt;/em&gt; is a python package for working with timeseries data. I've been looking a long time for an equivalent to R's dataframe functionality within python and this is it. What makes these dataframe structures so special is their ability to quickly slice and dice a table of data. &lt;br /&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;http://github.com/tmpvar/jsdom/blob/master/example/jquery/run.js&quot;&gt;node.js, jsdom and jQuery&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;JSdom allows you to emulate the entire document object model of a webpage without running a browser. I just found out today that node.js now runs jsdom and jquery seamlessly. This is nothing short of a webscraping revolution.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;http://code.google.com/p/crypto-js/&quot;&gt;CryptoJS&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A cryptographic library for Javascript. Perfect use case: Interactive API documentation. Got an api service? Want developers to adopt quicker? Allow them to submit signed API calls within your webpages. &lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;http://blog.asmartbear.com/marketplace-business-model.html&quot;&gt;The MarketPlace Play&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A detailed analysis of the marketplace business model.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
</content>
   
 </entry>
 
 <entry>
   <title>Exceptions are your Friends</title>
   <link href="http://www.huyng.com/posts/exceptions-are-your-friends"/>
   <updated>2010-02-09T00:00:00-08:00</updated>
   <id>http://www.huyng.com/posts/exceptions-are-your-friends</id>
   
   <category term="programming" />
   
   <category term="python" />
   

   <content type="html">&lt;!-- End Meta --&gt;
&lt;blockquote&gt;Robust code cries often and loudly as soon as something is not right. It does not cower away in corners of obscurity hoping that no one will notice, until one day, shit hits the fan.&lt;/blockquote&gt;
Any serious python code contains proper use of &lt;em&gt;exceptions&lt;/em&gt;, &lt;em&gt;errors&lt;/em&gt;, and &lt;em&gt;asserts&lt;/em&gt;. In fact, I would argue that their presence defines the difference between one-off throwaway scripts and robust library code.

Imagine a python interpreter without these facilities. What would the scenario look like?

&lt;!--more--&gt;

When you write to a non-existent or write-protected file, instead of the &lt;em&gt;sys&lt;/em&gt; module alerting you to the fact, you'd get .. &lt;strong&gt;nothing&lt;/strong&gt;. Not a single sign that your data is gone forever.

Exceptions are your friends. They're your canary in the coal mine. They let you know that something is wrong at the point of inception rather than pretending that everything is fine until two months later, when your client pops a vein because their files were not backed up.

Use &lt;em&gt;exceptions&lt;/em&gt; to annotate your code. Use them to communicate to your future colleagues that what they're trying to do is seriously wrong. Use them to put yourself in check before you try to modify that class attribute. &lt;em&gt;Exceptions&lt;/em&gt; keep you sane.
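
As a minimal sketch of what annotating code with an exception can look like (Python 3; the function name and the backup-interval scenario are made up for illustration), a function can refuse input that makes no sense instead of silently carrying on:

```python
def backup_interval_seconds(hours):
    """Convert a backup interval to seconds, failing loudly on bad input."""
    if isinstance(hours, int) and hours > 0:
        return hours * 3600
    # Tell the caller exactly what was wrong, right where it went wrong.
    raise ValueError("backup interval must be a positive integer, got %r" % (hours,))


print(backup_interval_seconds(24))
```

Calling it with 0 or "daily" raises a ValueError on the spot, rather than producing a schedule that never runs.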
&lt;h3&gt;Handling Exceptions&lt;/h3&gt;
Some would argue that robust code has no &lt;em&gt;exceptions&lt;/em&gt; in it. They would be happy to throw a giant all-inclusive &lt;em&gt;try ... except&lt;/em&gt; statement around every piece of code that raises an exception. Don't do it, please. That's like ignoring a crying kid without knowing why they are crying in the first place.

Here's a simple rule of thumb for dealing with exceptions -- I guess you can call it a best-practice. Catch &lt;em&gt;specific&lt;/em&gt; exceptions and errors &lt;strong&gt;only if you know what to do with them&lt;/strong&gt;. Otherwise, let some other piece of code higher in the call stack with enough context deal with it.
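
A short sketch of that rule of thumb, assuming Python 3 and a hypothetical helper named load_settings: the one exception the function knows how to handle is caught, and everything else is left to propagate.

```python
import json


def load_settings(path, defaults):
    """Read settings from a JSON file, falling back to defaults if it is missing."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        # We know what a missing file means here: first run, use the defaults.
        return dict(defaults)
    # Anything else (permission errors, malformed JSON, ...) is deliberately
    # not caught, so a caller with more context can decide what to do.


print(load_settings("no-such-settings.json", {"retries": 3}))
```

Note that there is no bare except clause: a corrupt settings file still surfaces as an error instead of being mistaken for a missing one.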

Robust code cries often and loudly as soon as something is not right. It does not cower away in corners of obscurity hoping that no one will notice, until one day, shit hits the fan. At the same time, when it knows how to deal with an exception, it'll do so. But it will &lt;em&gt;never&lt;/em&gt; suppress an exception when it has no idea.

&lt;strong&gt;Exceptions&lt;/strong&gt; and &lt;strong&gt;Exception Handlers&lt;/strong&gt;. These are the two pillars of robust, high-quality, and fault-tolerant code.
&lt;h4&gt;See Also&lt;/h4&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;a href=&quot;http://www.doughellmann.com/articles/how-tos/python-exception-handling/index.html&quot;&gt;Python Exception Handling Techniques - Doug Hellmann&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a href=&quot;http://docs.python.org/tutorial/errors.html&quot;&gt;Errors &amp;amp; Exceptions - Python Documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</content>
   
 </entry>
 
 <entry>
   <title>Don’t Hash Your Secrets, Here’s why in Python</title>
   <link href="http://www.huyng.com/posts/dont-hash-your-secrets-heres-why-in-python"/>
   <updated>2010-02-01T00:00:00-08:00</updated>
   <id>http://www.huyng.com/posts/dont-hash-your-secrets-heres-why-in-python</id>
   
   <category term="programming" />
   
   <category term="python" />
   

   <content type="html">&lt;!-- End Meta --&gt;
&lt;p&gt;Ben Adida suggests that you &lt;a href=&quot;http://benlog.com/articles/2008/06/19/dont-hash-secrets/&quot;&gt;don't hash your secrets&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;That means that if you know SHA1(secret || message), then you can compute SHA1(secret || message  || ANYTHING), which is a valid signature for message || ANYTHING. So to break this system, you just need to see one signature.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Not being a cryptography expert, I was blown away by his article. At the core of his post is the idea that given a hash digest of a &lt;strong&gt;message&lt;/strong&gt;, one could compute the hash of &lt;strong&gt;message + appended_message&lt;/strong&gt; without even knowing the original message. &lt;/p&gt;
&lt;p&gt;I had to see this for myself. Was it &lt;em&gt;that&lt;/em&gt; easy to extend an MD5 or SHA1 hash?
Below, you'll find working &lt;a href=&quot;/media/3006/spoof_md5.py.txt&quot;&gt;python code&lt;/a&gt; and an explanation for spoofing signatures signed with the MD5 algorithm. &lt;/p&gt;
&lt;!--more--&gt;
&lt;h3&gt;Implementation&lt;/h3&gt;
&lt;p&gt;To generate a hash from a message, algorithms like MD5 and SHA1 iterate through the message block by block. For each block, the algorithm runs a &lt;a href=&quot;http://en.wikipedia.org/wiki/Cryptographic_hash_function#Merkle-Damg.C3.A5rd_construction&quot;&gt;transformation function&lt;/a&gt; where the input is a &lt;strong&gt;seed state&lt;/strong&gt; and a &lt;strong&gt;message block&lt;/strong&gt;. The output of this transformation is then fed back as the &lt;strong&gt;seed state&lt;/strong&gt; for the transformation of the next message block (see the diagram below).&lt;/p&gt;

&lt;center&gt;&lt;img alt=&quot;md5.png&quot; src=&quot;/media/3006/md5.png&quot; width=&quot;100%&quot;&gt;&lt;/center&gt;
&lt;p&gt;After the hashing function has digested the entire message, it then appends some padding and runs the transformation function one more time. The &lt;strong&gt;final state&lt;/strong&gt; of this transformation becomes the digest. &lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;hashlib&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;md5&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;signature&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;md5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;secret&amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;hello world&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;digest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;repr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;signature&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;s&quot;&gt;&amp;quot;O&amp;#39;Q&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\xa8\xb8\x9d\x81&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\xd7\x13&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\xe0\xfb&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;_2&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\xde&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In the code above, &lt;strong&gt;the signature represents the state output of the final transformation function&lt;/strong&gt;. &lt;/p&gt;
&lt;p&gt;AHA! We now have a strategy to extend the hash. If we can seed the transformation function with the state (AKA signature) of the original message, we can essentially extend the hash without even knowing the original message. &lt;/p&gt;
&lt;p&gt;There is one problem, however. I mentioned before that the MD5 algorithm adds a piece of padding to the original message before it gives us the hash. That means whenever we see a signature, it's really the hash of the &lt;strong&gt;message + padding&lt;/strong&gt;. Fortunately, the padding depends only on the length of the original message. With that in mind, we can easily generate both the new signature and padding. Here's some pseudocode:&lt;/p&gt;
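&lt;p&gt;The idea can be tried out on a toy construction before touching MD5 itself. The sketch below is not MD5: &lt;code&gt;compress&lt;/code&gt; and &lt;code&gt;toy_hash&lt;/code&gt; are made-up names, the transformation function is simply borrowed from SHA-256, and there is no padding at all. It only assumes Python 3 and a message that fills whole 8-byte blocks, and it shows how a digest that equals the final state lets anyone keep hashing.&lt;/p&gt;

```python
from hashlib import sha256

BLOCK_SIZE = 8  # bytes per message block in this toy construction


def compress(state, block):
    """Toy transformation function: mixes the previous state with one block."""
    return sha256(state + block).digest()[:8]


def toy_hash(message, state=b"\x00" * 8):
    """Chain `compress` over the message, block by block (no padding)."""
    for start in range(0, len(message), BLOCK_SIZE):
        state = compress(state, message[start:start + BLOCK_SIZE])
    return state


secret_and_message = b"secret--hello world, ok!"  # 24 bytes = 3 full blocks
signature = toy_hash(secret_and_message)

# The attacker only sees `signature`, yet can keep hashing from that state.
extension = b"appended"
forged = toy_hash(extension, state=signature)

print(forged == toy_hash(secret_and_message + extension))
```

&lt;p&gt;The comparison holds because the digest is nothing more than the state after the last block, which is exactly the property the attack relies on.&lt;/p&gt;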
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;decode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;signature&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;padding&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;calculate_padding&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;original_message_len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;new_signature&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;transform&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;appended message&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# This should be True&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;new_signature&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;md5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;original_message&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;padding&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;appended message&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;digest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
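&lt;p&gt;The padding step from the pseudocode can be sketched on its own. The helper name &lt;code&gt;md5_padding&lt;/code&gt; is hypothetical and the snippet assumes Python 3; the layout it produces (a single 0x80 byte, zero bytes up to 56 mod 64, then the message length in bits as a 64-bit little-endian integer) is the padding described in the MD5 specification.&lt;/p&gt;

```python
def md5_padding(message_len):
    """Return the padding MD5 appends to a message of `message_len` bytes."""
    index = message_len % 64  # bytes already used in the last 64-byte block
    if index >= 56:
        pad_len = 120 - index  # not enough room: spill into one more block
    else:
        pad_len = 56 - index
    bit_count = (message_len * 8) % 2 ** 64  # length in bits, modulo 2**64
    return b"\x80" + b"\x00" * (pad_len - 1) + bit_count.to_bytes(8, "little")


original_len = len(b"secret" + b"hello world")
padding = md5_padding(original_len)

# Message plus padding always fills a whole number of 64-byte blocks.
print((original_len + len(padding)) % 64 == 0)
```

&lt;p&gt;Nothing in the function looks at the message bytes, only at the length, which is why an attacker who can guess the length of the secret can rebuild the padding.&lt;/p&gt;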
&lt;p&gt;Now here is the real code.&lt;br /&gt;
&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;spoof_digest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;originalDigest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;originalLen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;spoofMessage&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# first decode digest back into state tuples&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Decode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;originalDigest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# generate a seed md5 object&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;spoof&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;md5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# seed the count variable for calculation of index, padLen, and bits&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;spoof&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;originalLen&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;3&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# calculate some variables to generate the original padding&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;spoof&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x3f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;padLen&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;56&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;56&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;120&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bits&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Encode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;spoof&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xffffffffL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;spoof&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# construct the original padding&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;padding&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PADDING&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;padLen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# augment the count with the new padding and trailing bits&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;spoof&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;padding&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;3&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;spoof&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;3&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;spoof&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# run an update&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;spoof&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;update&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;spoofMessage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# We now have a digest of the original secret + message + some_padding&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;spoof&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;digest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;padding&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The code depends on a pure-Python implementation of the MD5 algorithm, which I've packaged together with &lt;a href=&quot;/media/3006/spoof_md5.py.txt&quot;&gt;the source code&lt;/a&gt;. If you want to try it out, download the file and run this test function (also included in the file):&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_spoofing&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;originalMsg&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;secret&amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;my message&amp;quot;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;appendedMsg&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;quot;my message extension&amp;quot;&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# This is the signature that a legitimate user sends&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# over the wire in clear text. &lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;originalSignature&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;md5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;originalMsg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;digest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# This is how an attacker would spoof the signature, where&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# the message == originalMsg + padbits + appendedMsg.&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# Notice that this method implies that the attacker&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# knows the original length of the &amp;quot;secret&amp;quot; ... &lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# Most apis such as Flickr assign secrets that are of&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# uniform length for all of their api users.&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;spoofSignature&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;padbits&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;spoof_digest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;originalSignature&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;originalMsg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;appendedMsg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# This is how a legitimate user would construct the&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# signature when message == originalMsg + padbits + appendedMsg&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;testSignature&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;md5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;originalMsg&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;padbits&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;appendedMsg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;digest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# make sure the spoof signature and the test signature match.&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# If this passes, we&amp;#39;ve successfully constructed a spoofed message&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# of the form: secret + original_message + padding + appended_message&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# without actually knowing the secret.&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;testSignature&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;spoofSignature&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
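&lt;p&gt;To make the &lt;code&gt;padbits&lt;/code&gt; value in the test less mysterious, here is a minimal Python 3 sketch of the bytes MD5 appends to a message of a given length. The helper name &lt;code&gt;md5_glue_padding&lt;/code&gt; is hypothetical and is not part of the downloadable file. It assumes the standard MD5 padding rule: one 0x80 byte, zero bytes until the length is 56 mod 64, then the bit count as 8 little-endian bytes.&lt;/p&gt;

```python
import hashlib


def md5_glue_padding(msg_len):
    """Return the bytes MD5 appends to a message that is msg_len bytes long."""
    index = msg_len % 64                   # position inside the current 64-byte block
    pad_len = (55 - index) % 64 + 1        # 1..64 bytes, so the padded length is 56 mod 64
    padding = b"\x80" + b"\x00" * (pad_len - 1)
    bit_count = (msg_len * 8) % (2 ** 64)  # message length in bits, kept to 64 bits
    return padding + bit_count.to_bytes(8, "little")


original = b"secret" + b"my message"
glue = md5_glue_padding(len(original))

# The padded message always fills a whole number of 64-byte blocks.
padded_len = len(original) + len(glue)
print(padded_len % 64)

# The digest a server would compute for the extended message.
extended_digest = hashlib.md5(original + glue + b"my message extension").hexdigest()
print(extended_digest)
```

&lt;p&gt;Only the length of the original message goes into the padding, which is why the attacker needs to know the length of the secret but not the secret itself.&lt;/p&gt;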
&lt;p&gt;&lt;strong&gt; Information in this blog is meant for educational purposes only! &lt;/strong&gt;&lt;/p&gt;
</content>
   
 </entry>
 
 <entry>
   <title>Quick Bash Tip : Directory Bookmarks</title>
   <link href="http://www.huyng.com/posts/quick-bash-tip-directory-bookmarks"/>
   <updated>2009-09-10T00:00:00-07:00</updated>
   <id>http://www.huyng.com/posts/quick-bash-tip-directory-bookmarks</id>
   
   <category term="programming" />
   
   <category term="bash" />
   

   <content type="html">&lt;!-- End Meta --&gt;
&lt;p&gt;&lt;strong&gt;EDIT 2010-07-01&lt;/strong&gt; : I've packaged up a shell script to allow you to save and jump to commonly used directories. It's called &lt;a href=&quot;/bashmarks-directory-bookmarks-for-the-shell/&quot;&gt;bashmarks&lt;/a&gt; and it has tab completion functionality built-in. Learn more about &lt;a href=&quot;/bashmarks-directory-bookmarks-for-the-shell/&quot;&gt;bashmarks here&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;&lt;a href=&quot;/bashmarks-directory-bookmarks-for-the-shell/&quot;&gt;Entirely new and improved version&lt;/a&gt;&lt;/h3&gt;
&lt;h3&gt;Do not use the stuff below&lt;/h3&gt;&lt;br&gt;

&lt;hr /&gt;
&lt;p&gt;Before I wrote this script, it felt like I spent half of my time in the terminal cd-ing around to various directories. If you're like me, placing this snippet into your .bashrc file will save you tons of time every single day:&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;# Bash Directory Bookmarks
alias m1='alias g1=&amp;quot;cd `pwd`&amp;quot;'
alias m2='alias g2=&amp;quot;cd `pwd`&amp;quot;'
alias m3='alias g3=&amp;quot;cd `pwd`&amp;quot;'
alias m4='alias g4=&amp;quot;cd `pwd`&amp;quot;'
alias m5='alias g5=&amp;quot;cd `pwd`&amp;quot;'
alias m6='alias g6=&amp;quot;cd `pwd`&amp;quot;'
alias m7='alias g7=&amp;quot;cd `pwd`&amp;quot;'
alias m8='alias g8=&amp;quot;cd `pwd`&amp;quot;'
alias m9='alias g9=&amp;quot;cd `pwd`&amp;quot;'
alias mdump='alias|grep -e &amp;quot;alias g[0-9]&amp;quot;|grep -v &amp;quot;alias m&amp;quot; &amp;gt; ~/.bookmarks'
alias lma='alias | grep -e &amp;quot;alias g[0-9]&amp;quot;|grep -v &amp;quot;alias m&amp;quot;|sed &amp;quot;s/alias //&amp;quot;'
touch ~/.bookmarks
source ~/.bookmarks&lt;/pre&gt;&lt;/div&gt;
&lt;!--more--&gt;
&lt;h4&gt;Directory Bookmark Usage&lt;/h4&gt;
&lt;p&gt;With this in place, your bash shell will be able to set and retrieve directory bookmarks. Let's say you're in a folder that you visit hundreds of times per day. Run one of the &quot;m&quot; (a.k.a. &lt;em&gt;mark&lt;/em&gt;) commands inside the directory to create a bookmark. Here's an example:&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;# This will create a bookmark for the /var/www directory
user@host[/var/www/] : m1&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now whenever you want to &lt;em&gt;cd&lt;/em&gt; into that directory, you can run the corresponding &quot;g&quot; (a.k.a &lt;em&gt;goto mark&lt;/em&gt;) command. &lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;# This will cd into /var/www
user@host[/etc/apache2] : g1&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In other words, the &lt;em&gt;m1&lt;/em&gt; command sets the &lt;em&gt;g1&lt;/em&gt; bookmark, the &lt;em&gt;m2&lt;/em&gt; command sets the &lt;em&gt;g2&lt;/em&gt; bookmark, and so on. If you don't want to keep track of these bookmarks in your head, the &quot;lma&quot; (a.k.a. &quot;list marks&quot;) command shows all of your current bookmarks like so:&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;user@host[/usr/local/] : lma
g1='cd /var/www/'
g2='cd /etc/'&lt;/pre&gt;&lt;/div&gt;
&lt;h4&gt;Persisting the Bookmarks&lt;/h4&gt;
&lt;p&gt;If you want to preserve your bookmarks for the next time you log in, execute the &lt;em&gt;mdump&lt;/em&gt; command, which stores the bookmarks in a file called &lt;em&gt;.bookmarks&lt;/em&gt; under your HOME directory. Keep in mind that if you do not run this command, your bookmarks will be forgotten once you log out of the shell.&lt;/p&gt;
</content>
   
 </entry>
 
 <entry>
   <title>How I Located a Camera in your Back Yard</title>
   <link href="http://www.huyng.com/posts/how-i-located-a-camera-in-your-back-yard"/>
   <updated>2009-08-29T00:00:00-07:00</updated>
   <id>http://www.huyng.com/posts/how-i-located-a-camera-in-your-back-yard</id>
   
   <category term="tricks" />
   
   <category term="python" />
   

   <content type="html">&lt;!-- End Meta --&gt;
&lt;p&gt;I found a webcam in &lt;em&gt;your&lt;/em&gt; neighborhood. As I type, I see &lt;em&gt;your&lt;/em&gt; dog easing out a steady stream onto your neighbor's freshly lacquered patio. Don't believe me? See the results for yourself:&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://www.huyng.com/apps/geocams.html&quot;&gt;(Unsecure) Webcams Around the World&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;How?&lt;/h3&gt;
&lt;p&gt;It started with this &lt;a href=&quot;http://www.google.com/search?q=inurl%3A%22viewerframe%3Fmode%3Dmotion%22&quot;&gt;link&lt;/a&gt;, then &lt;a href='http://www.google.com/search?hl=en&amp;amp;q=intitle:&quot;wj-nt104+main&quot;'&gt;this&lt;/a&gt;, and finally &lt;a href='http://www.google.com/search?hl=en&amp;amp;q=intitle:&quot;live+view+/+-+axis&quot;'&gt;this&lt;/a&gt;.&lt;/p&gt;
&lt;!--more--&gt;
&lt;p&gt;These three simple Google searches reveal security cameras around the world that are, ironically, left unsecured for the entire internet to see. While I used Google for illustrative purposes above, Yahoo works pretty much the same way for these particular searches. More importantly, there is a nice Python interface to Yahoo called &lt;a href=&quot;http://pysearch.sourceforge.net/&quot;&gt;pYsearch&lt;/a&gt;, which means that you can use it to programmatically harvest URLs for thousands of these cameras. Below, I use this module to collect the first 40 webcam URLs from a Yahoo search:&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;yahoo.search&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;web&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# Do the search&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;query_string&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;#39;inurl:&amp;quot;viewerframe?mode=motion&amp;quot;&amp;#39;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;web&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;WebSearch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;YahooDemo&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;query_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;40&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;result_list&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parse_results&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# Extract the urls&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;webcam_urls&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;Url&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result_list&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;h3&gt;Geolocating the Cameras&lt;/h3&gt;
&lt;p&gt;By itself, this is already cool, but to make things a little more interesting, I wanted to identify the exotic locations these cameras were spying on day-in and day-out. There happens to be a wonderful web service that does geolocation on IP addresses, and they have an open API available at &lt;a href=&quot;http://ipinfodb.com/ip_location_api.php&quot;&gt;ipinfodb&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In order to use it, we must first translate the webcam URLs into IP addresses if it's not done for us already. Below is a utility function I wrote to accomplish this.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Extract IP Address from URL&lt;/strong&gt;&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;url2ip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;socket&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;urlparse&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;urlparse&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;basename&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;urlparse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;:&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;socket&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gethostbyname&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;basename&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# obtain a list of ip addresses&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ip_list&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;url2ip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;url&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;webcam_urls&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
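&lt;p&gt;Splitting the network location on &lt;code&gt;':'&lt;/code&gt; works for plain &lt;code&gt;host:port&lt;/code&gt; URLs, but it returns the user name if a URL carries credentials. As a hedged alternative, here is a Python 3 sketch that relies on the &lt;code&gt;hostname&lt;/code&gt; attribute of &lt;code&gt;urllib.parse.urlsplit&lt;/code&gt;, which drops both the port and any credentials. The helper name &lt;code&gt;url2host&lt;/code&gt; and the example URLs are made up for illustration; the DNS lookup still goes through &lt;code&gt;socket.gethostbyname&lt;/code&gt; as above.&lt;/p&gt;

```python
import socket
from urllib.parse import urlsplit


def url2host(url):
    """Return the lowercase host part of a URL, without port or credentials."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError("no host found in %r" % url)
    return host


def url2ip(url):
    """Resolve the URL's host to an IPv4 address string (needs DNS access)."""
    return socket.gethostbyname(url2host(url))


# Hypothetical URLs, used only to show the host extraction step.
print(url2host("http://example.com:8080/ViewerFrame?Mode=Motion"))
print(url2host("http://admin:pw@192.0.2.10:81/view/index.shtml"))
```

&lt;p&gt;Keeping the parsing separate from the lookup also makes it possible to check the host extraction without touching the network.&lt;/p&gt;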
&lt;p&gt;After extracting the IP addresses from all of these URLs, you're ready to geolocate them using &lt;a href=&quot;http://ipinfodb.com/ip_location_api.php&quot;&gt;ipinfodb&lt;/a&gt;. The function below, given an IP address, returns the longitude, latitude and various other pieces of geographical information associated with it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Geolocate IP addresses&lt;/strong&gt;&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;ip2geo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ip_addr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;urllib&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;urlopen&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;xml.dom&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minidom&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# setup url and request&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;url&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;  &lt;span class=&quot;s&quot;&gt;u&amp;quot;http://ipinfodb.com/ip_query.php?ip=&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;%s&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ip_addr&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;response&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;urlopen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# Convert response into a dictionary with key:value pairs&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;doc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minidom&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;doc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;firstChild&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;childNodes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;firstChild&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; 
            &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nodeName&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;firstChild&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nodeValue&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
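&lt;p&gt;The parsing step can be tried without calling the web service. The following Python 3 sketch uses &lt;code&gt;xml.etree.ElementTree&lt;/code&gt; instead of &lt;code&gt;minidom&lt;/code&gt; and builds a stand-in response locally. The function name &lt;code&gt;xml_to_dict&lt;/code&gt; and the root element name &lt;code&gt;Response&lt;/code&gt; are assumptions; the child element names are taken from the example output in this post.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET


def xml_to_dict(xml_bytes):
    """Map each non-empty child of the root element to tag -> text."""
    root = ET.fromstring(xml_bytes)
    return {child.tag: child.text for child in root if child.text}


# Build a stand-in response locally so the example runs without the network.
# "Response" is an assumed root name; only its children matter here.
sample = ET.Element("Response")
for tag, text in [("Ip", "74.125.45.100"), ("Status", "OK"),
                  ("Latitude", "37.4192"), ("Longitude", "-122.057")]:
    ET.SubElement(sample, tag).text = text
ET.SubElement(sample, "RegionName")  # empty element, skipped by xml_to_dict

geo = xml_to_dict(ET.tostring(sample))
print(geo["Latitude"], geo["Longitude"])
```

&lt;p&gt;As in &lt;code&gt;ip2geo&lt;/code&gt;, elements without text are left out of the dictionary, so callers should be prepared for missing keys.&lt;/p&gt;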
&lt;p&gt;&lt;strong&gt;Example output&lt;/strong&gt;&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span class=&quot;n&quot;&gt;In&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ip2geo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;74.125.45.100&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;Out&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]:&lt;/span&gt; 
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;u&amp;#39;City&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;u&amp;#39;Mountain View&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
 &lt;span class=&quot;s&quot;&gt;u&amp;#39;CountryCode&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;u&amp;#39;US&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
 &lt;span class=&quot;s&quot;&gt;u&amp;#39;CountryName&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;u&amp;#39;United States&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
 &lt;span class=&quot;s&quot;&gt;u&amp;#39;Dstoffset&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;u&amp;#39;-7.0&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
 &lt;span class=&quot;s&quot;&gt;u&amp;#39;Gmtoffset&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;u&amp;#39;-8.0&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
 &lt;span class=&quot;s&quot;&gt;u&amp;#39;Ip&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;u&amp;#39;74.125.45.100&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
 &lt;span class=&quot;s&quot;&gt;u&amp;#39;Latitude&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;u&amp;#39;37.4192&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
 &lt;span class=&quot;s&quot;&gt;u&amp;#39;Longitude&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;u&amp;#39;-122.057&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
 &lt;span class=&quot;s&quot;&gt;u&amp;#39;RegionCode&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;u&amp;#39;06&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
 &lt;span class=&quot;s&quot;&gt;u&amp;#39;RegionName&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;u&amp;#39;California&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
 &lt;span class=&quot;s&quot;&gt;u&amp;#39;Status&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;u&amp;#39;OK&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
 &lt;span class=&quot;s&quot;&gt;u&amp;#39;ZipPostalCode&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;u&amp;#39;94043&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;h3&gt;Plotting the Locations on a Google Map&lt;/h3&gt;
&lt;p&gt;Google has a fairly simple mechanism that allows developers to create custom Google Maps. You first load their map widget onto your webpage using some boilerplate JavaScript/HTML:&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;/media/3007/google-maps-base.html.txt&quot;&gt;Google Maps Basic Setup&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;These two lines -&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;geoXml&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;GGeoXml&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;http://code.google.com/apis/kml/documentation/KML_Samples.kml&amp;quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;addOverlay&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;geoXml&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;- in the boilerplate code will then load an external &lt;a href=&quot;http://code.google.com/apis/kml/documentation/topicsinkml.html&quot;&gt;KML&lt;/a&gt; file, which is where you put all of the annotations for the map, such as placemarks, shape boundaries and so on. Note that since &lt;em&gt;you&lt;/em&gt; will be generating your own &lt;a href=&quot;http://code.google.com/apis/kml/documentation/topicsinkml.html&quot;&gt;KML&lt;/a&gt; file to pinpoint the camera locations, you must change the URL in the boilerplate code to point to that file.&lt;/p&gt;
&lt;p&gt;At this point we have a list of longitude/latitude coordinates obtained from geolocating IP addresses associated with webcam URLs. To plot these camera locations on a Google map, we will generate a KML file using a small utility that I wrote:&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://github.com/huyng/pygmap/&quot;&gt;pygmap - A KML Generator&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;It essentially gives you a pythonic interface for generating KML files like these:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sample KML - A placemark over Mountain View,CA&lt;/strong&gt;&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span class=&quot;cp&quot;&gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;kml&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;xmlns=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;http://www.opengis.net/kml/2.2&amp;quot;&lt;/span&gt;&lt;span class=&quot;nt&quot;&gt;&amp;gt;&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;&amp;lt;Placemark&amp;gt;&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;&amp;lt;name&amp;gt;&lt;/span&gt;Simple placemark&lt;span class=&quot;nt&quot;&gt;&amp;lt;/name&amp;gt;&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;&amp;lt;description&amp;gt;&lt;/span&gt;
        A longer description for the placemark. It can include &lt;span class=&quot;nt&quot;&gt;&amp;lt;b&amp;gt;&lt;/span&gt;HTML&lt;span class=&quot;nt&quot;&gt;&amp;lt;/b&amp;gt;&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;&amp;lt;/description&amp;gt;&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;&amp;lt;Point&amp;gt;&lt;/span&gt;
      &lt;span class=&quot;nt&quot;&gt;&amp;lt;coordinates&amp;gt;&lt;/span&gt;-122.057,37.4192,0&lt;span class=&quot;nt&quot;&gt;&amp;lt;/coordinates&amp;gt;&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;&amp;lt;/Point&amp;gt;&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;&amp;lt;/Placemark&amp;gt;&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;/kml&amp;gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;All we have to do now is to loop over our list of longitudes and latitudes, and plot them onto the Google Map using KML. You can see this in action from the code below: &lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;pygmap&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GMap&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GMap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ipaddr&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ip_list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# Geolocate ipaddress&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;geo&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ip2geo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ipaddr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;lon&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;float&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;geo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;Longitude&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;lat&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;float&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;geo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;#39;Latitude&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_placemark&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lon&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lon&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lat&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;Camera&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;desc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;quot;this is a &amp;lt;b&amp;gt;Camera&amp;lt;/b&amp;gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# Output the result KML&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;print&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;renderKML&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;So there you have it, a KML file that has all of the camera locations plotted. Just make sure to point the boilerplate google-map html page to your own generated KML file and you should see results like the map at the beginning of this post:&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://www.huyng.com/apps/geocams.html&quot;&gt;Webcams Around the World&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;Source Code&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;http://github.com/huyng/pygmap/&quot;&gt;pygmap - A KML Placemark Generator&lt;/a&gt;&lt;br /&gt;
&lt;a href=&quot;/media/3007/google-maps-base.html.txt&quot;&gt;Google Maps HTML Boilerplate&lt;/a&gt;&lt;/p&gt;
</content>
   
 </entry>
 
 <entry>
   <title>Python logging from multiple processes</title>
   <link href="http://www.huyng.com/posts/python-logging-from-multiple-processes"/>
   <updated>2009-08-13T00:00:00-07:00</updated>
   <id>http://www.huyng.com/posts/python-logging-from-multiple-processes</id>
   

   <content type="html">&lt;span class=&quot;markdownOutput&quot;&gt;
&lt;p&gt;Had a rough day today. I just wanted a log. &lt;em&gt;Not&lt;/em&gt; just any log mind you, but one that could handle writes from multiple processes running at the same time. A naive me would have put Python&amp;#8217;s basic log handler within each process and then watched as the processes crashed and burned because of conflicting disk write access.&lt;/p&gt;
&lt;p&gt;But I&amp;#8217;ve learned from my past lessons - now I use Python&amp;#8217;s SocketHandler for my logging needs. Here&amp;#8217;s a basic snippet to get you started. &lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# ============== Socket Logger =============== #
import logging
import logging.handlers # you NEED this line
import os
import sys
logger = logging.getLogger(&quot;%s_%s&quot; % (os.getpid(), sys.argv[1]) )
logger.setLevel(logging.DEBUG)
socketHandler = logging.handlers.SocketHandler('localhost',
                    logging.handlers.DEFAULT_TCP_LOGGING_PORT)
logger.addHandler(socketHandler)
# ============== Socket Logger =============== #&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In a nutshell, it gets a logger instance, sets the logging level to DEBUG, and then attaches a SocketHandler to it. This means that whenever any piece of your code calls logger.debug(&amp;#8220;this is my log message&amp;#8221;), the message is sent over a TCP socket to a server sitting on the other end, which handles writing your log messages to disk. DEFAULT_TCP_LOGGING_PORT is 9020, so the server will be listening on localhost:9020.&lt;/p&gt;
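&lt;p&gt;To make the exchange less mysterious, here is a minimal sketch of both ends of the conversation, written for Python 3. It is not python-loggingserver: the names RecordReceiver and demo are made up for illustration, and the toy receiver only collects messages in a list instead of writing them to disk. It relies on SocketHandler sending each record as a pickled dictionary prefixed by a 4-byte big-endian length.&lt;/p&gt;

```python
import logging
import logging.handlers
import pickle
import socketserver
import struct
import threading


class RecordReceiver(socketserver.StreamRequestHandler):
    # Hypothetical receiver: reads length-prefixed, pickled LogRecord dicts.

    def handle(self):
        while True:
            header = self.rfile.read(4)
            if len(header) != 4:
                break  # the client closed the connection
            (size,) = struct.unpack('!L', header)  # 4-byte big-endian length
            payload = self.rfile.read(size)
            # Only unpickle data that comes from senders you trust.
            record = logging.makeLogRecord(pickle.loads(payload))
            self.server.received.append(record.getMessage())
            self.server.got_one.set()


def demo():
    # Port 0 asks the OS for any free port (an assumption for this demo);
    # the post uses logging.handlers.DEFAULT_TCP_LOGGING_PORT instead.
    server = socketserver.TCPServer(('127.0.0.1', 0), RecordReceiver)
    server.received = []                 # messages seen by the receiver
    server.got_one = threading.Event()   # set once a record has arrived
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()

    logger = logging.getLogger('socket_demo')
    logger.setLevel(logging.DEBUG)
    handler = logging.handlers.SocketHandler('127.0.0.1', port)
    logger.addHandler(handler)
    logger.debug('this is my log message')

    server.got_one.wait(timeout=5)   # give the receiver time to read it
    logger.removeHandler(handler)
    handler.close()                  # closing the socket ends handle() above
    server.shutdown()
    server.server_close()
    return list(server.received)
```

&lt;p&gt;Calling demo() returns the list of messages the receiver saw. A real server would hand each record to its own file handler rather than keep it in memory.&lt;/p&gt;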
&lt;h3 id=&quot;thelogserver&quot;&gt;The Log Server&lt;/h3&gt;
&lt;p&gt;I was surprised that Python&amp;#8217;s standard library didn&amp;#8217;t already ship a canonical logging server to sit on the receiving end of SocketHandler. Luckily there&amp;#8217;s a pretty neat project that does just that: &lt;a href=&quot;http://code.google.com/p/python-loggingserver/&quot;&gt;python-loggingserver&lt;/a&gt;. As a bonus, it comes with a web interface for viewing your logs. On most systems, you can reach it at http://localhost:9021/ once you&amp;#8217;ve started the server.&lt;/p&gt;
&lt;p&gt;It requires the Twisted networking library, so if you don&amp;#8217;t have it installed yet, run&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sudo easy_install twisted&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then svn checkout the project:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;svn checkout http://python-loggingserver.googlecode.com/svn/trunk/ logserver&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Start the server:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cd logserver
twistd --pidfile=loggingserver.pid --logfile=loggingserver.log --python=loggingserver.py&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now you&amp;#8217;re ready to log. You can now use the code snippet posted at the beginning of this article or just use the prepackaged testing script to send messages to the server:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;python loggingtest.py this_is_one_process&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;View the results at &lt;a href=&quot;http://localhost:9021&quot;&gt;http://localhost:9021&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Happy logging!&lt;/p&gt;


&lt;/span&gt;
</content>
   
 </entry>
 
 <entry>
   <title>Save Your Hands</title>
   <link href="http://www.huyng.com/posts/save-your-hands"/>
   <updated>2009-07-21T00:00:00-07:00</updated>
   <id>http://www.huyng.com/posts/save-your-hands</id>
   
   <category term="python" />
   

   <content type="html">&lt;span class=&quot;markdownOutput&quot;&gt;
&lt;p&gt;Here&amp;#8217;s a brief tip: &lt;strong&gt;Rebind your ctrl key to capslock&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Save yourself from the pain. The contorted positions we developers strain our hands into will eventually break them - Emacs &amp;amp; Textmate users, you know what I&amp;#8217;m talking about. Do yourself a favor and rebind your control key to capslock. You can do this under MacOSX by going to &lt;em&gt;System Preferences&lt;/em&gt; &gt; &lt;em&gt;Keyboard &amp;amp; Mouse&lt;/em&gt; &gt; &lt;em&gt;Keyboard&lt;/em&gt; &gt; &lt;em&gt;Modifier Keys&lt;/em&gt;. Change the settings to look like the following and you&amp;#8217;re set.&lt;/p&gt;
&lt;p&gt;&lt;img id=&quot;keyboardandmouse&quot; src=&quot;/media/3008/keyboardandmouse.png&quot; alt=&quot;keyboardandmouse&quot; title=&quot;&quot; /&gt;&lt;/p&gt;
&lt;/span&gt;
</content>
   
 </entry>
 
 <entry>
   <title>Franchising: Running  Multiple Sites from one Django Codebase</title>
   <link href="http://www.huyng.com/posts/franchising-running-multiple-sites-from-one-django-codebase"/>
   <updated>2009-07-14T00:00:00-07:00</updated>
   <id>http://www.huyng.com/posts/franchising-running-multiple-sites-from-one-django-codebase</id>
   
   <category term="django" />
   
   <category term="python" />
   

   <content type="html">While surveying the modularity and reusability of code written with Django, I’ve observed that projects usually travel down two divergent paths of evolution. Either they start small and grow into behemoths as developers add feature-after-feature, or they stop growing after a certain point and are packaged into reusable components. Today I propose a third route that weaves between these two extremes and allows you to reuse your existing code by running multiple websites from a single codebase. In essence, I want you to &lt;strong&gt;feel like you're developing one site&lt;/strong&gt;, when in reality you're making multiple.  For lack of a better term, I’ll call this process &lt;em&gt;Franchising&lt;/em&gt; your Django project.
&lt;h3 id=&quot;whyfranchiseyourproject&quot;&gt;Why franchise your project?&lt;/h3&gt;
When my &lt;a href=&quot;http://www.pipra.org&quot;&gt;organization&lt;/a&gt; began licensing our &lt;a href=&quot;http://www.piercesdisease.org&quot;&gt;research-tracking&lt;/a&gt; software to multiple clients, each of them wanted to make minor adjustments to various aspects of our software suite.

Our initial response to this situation was to branch the original code into a separate version for each site. That way, when a client wanted minor changes, we would make the modification in their respective code branch. This approach worked fine as long as we weren’t adding any new features to our software. The problem, however, was that we were.

As more clients began licensing our software and as we continued  to make improvements to it, the process of syncing those changes between all the different code branches became unmanageable. Despite heroic efforts, our svn-fu was not up to par.

Franchising was our answer to this tangled mess.
&lt;h3 id=&quot;definingfranchise&quot;&gt;Defining “Franchise”&lt;/h3&gt;
The goal was to develop a system capable of sharing common components between multiple sites from a single code-base while keeping it flexible enough for us to make minor changes to each specific site. The critical question that we had to answer was, “What did we need to keep in common between all sites, and what were the things we wanted to change from site to site?” The table below is our attempt at redefining the above question in terms of components common to a Django project.
&lt;table border=&quot;0&quot;&gt;&lt;caption id=&quot;djangocomponentsinafranchisedproject&quot;&gt;Django Components in a Franchised Project&lt;/caption&gt; &lt;col&gt;&lt;/col&gt; &lt;col align=&quot;center&quot;&gt;&lt;/col&gt; &lt;col align=&quot;center&quot;&gt;&lt;/col&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Globally-Shared&lt;/th&gt;
&lt;th&gt;Site-Specific&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;th&gt;database&lt;/th&gt;
&lt;td align=&quot;center&quot;&gt;&lt;/td&gt;
&lt;td align=&quot;center&quot;&gt;x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;th&gt;models&lt;/th&gt;
&lt;td align=&quot;center&quot;&gt;x&lt;/td&gt;
&lt;td align=&quot;center&quot;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;th&gt;urls&lt;/th&gt;
&lt;td align=&quot;center&quot;&gt;x&lt;/td&gt;
&lt;td align=&quot;center&quot;&gt;x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;th&gt;views&lt;/th&gt;
&lt;td align=&quot;center&quot;&gt;x&lt;/td&gt;
&lt;td align=&quot;center&quot;&gt;x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;th&gt;settings&lt;/th&gt;
&lt;td align=&quot;center&quot;&gt;x&lt;/td&gt;
&lt;td align=&quot;center&quot;&gt;x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;th&gt;templates&lt;/th&gt;
&lt;td align=&quot;center&quot;&gt;x&lt;/td&gt;
&lt;td align=&quot;center&quot;&gt;x&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
From this table, three important pieces of information stand out. One, most Django components - the settings, views, urls, and templates - will have both a globally shared aspect and a site-specific aspect. Two, all sites, despite their differences in templates and views, will have a globally shared model schema. Why? Because this allows us to develop our Django project as if we were running only one site. Finally, each site will have its own unique database since each site will have its own unique data.

With this in mind, a typical directory layout of a “franchised” Django project might look something like this:

&lt;strong&gt;Franchised Django Project - Directory Layout&lt;/strong&gt;&lt;br /&gt;
&lt;center&gt;&lt;img src=&quot;/media/3005/franchised-django-app.png&quot; alt=&quot;folder structure&quot; /&gt;&lt;/center&gt;

In the top level directory sit all of the globally shared settings, applications, and urls. Under that, a “site&lt;code&gt;_&lt;/code&gt;overloads” folder contains all of the site-specific components such as databases, urls, and templates.
&lt;h3 id=&quot;implementingafranchiseddjangosite&quot;&gt;Implementing a Franchised Django Site&lt;/h3&gt;
Now the trickiest part about franchising your Django project is getting the globally shared components and the site-specific components to work seamlessly with each other. Inherent within this problem is the challenge of figuring out where to route users once they reach one of your sites.

Going along with our research-tracker example, imagine that a user has typed “http://research.nasa.com/” into their browser. If it were possible to dynamically configure Django with a specific site’s settings before responding, we could essentially run many sites from a single codebase by switching between the different configurations. Luckily for us, Apache and most web servers let you set environment variables depending on the hostname of the incoming request. Here’s how you do it in Apache’s /etc/apache2/httpd.conf file:

&lt;strong&gt;/etc/apache2/httpd.conf&lt;/strong&gt;
&lt;pre&gt;&lt;code&gt;&amp;lt;VirtualHost *:80&amp;gt;
        SetHandler python-program
        PythonHandler django.core.handlers.modpython
        SetEnv DJANGO_SETTINGS_MODULE settings
        SetEnv OVERLOAD_SITE nasa
        PythonPath &quot;['/var/www/data/research_tracker'] + sys.path&quot;
        ServerName research.nasa.com
&amp;lt;/VirtualHost&amp;gt;&lt;/code&gt;&lt;/pre&gt;
Using the &lt;em&gt;SetEnv&lt;/em&gt; directive, we set a variable called “OVERLOAD&lt;code&gt;_&lt;/code&gt;SITE” to “nasa”. Now, anytime someone requests a page from “http://research.nasa.com” Apache will automatically set the variable before handing off the request to Django.

By appending the following snippet to our top-level settings.py file, we can use the environment variable to dynamically load the desired site-specific settings.

&lt;strong&gt;globally-shared settings.py&lt;/strong&gt;
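&lt;pre&gt;&lt;code&gt;import os
# Name of the site to load, e.g. &quot;nasa&quot; (set by Apache via SetEnv)
OVERLOAD_SITE = os.environ.get('OVERLOAD_SITE')
OVERLOAD_SITE_MODULE = &quot;site_overloads&quot; + &quot;.&quot; + OVERLOAD_SITE
# Pull every site-specific setting into this module's namespace
exec(&quot;from %s.settings import *&quot; % OVERLOAD_SITE_MODULE)&lt;/code&gt;&lt;/pre&gt;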
The &lt;em&gt;os.environ.get()&lt;/em&gt; function in the above code extracts the value of our desired variable, “OVERLOAD&lt;code&gt;_&lt;/code&gt;SITE”, from the process environment. Using this value, it determines the correct import path to the site-specific settings. Finally, with the &lt;em&gt;exec()&lt;/em&gt; function, it imports the site-specific settings.
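
For readers who would rather avoid exec, the same idea can be sketched with importlib. This is only an illustration under a few assumptions: the helper names load_site_settings and demo are hypothetical, the code targets Python 3, and the demo builds a throwaway site_overloads/nasa/settings.py in a temporary directory so that it can run anywhere. Unlike the snippet above, it returns an empty dict when OVERLOAD_SITE is not set instead of failing.

```python
import importlib
import os
import sys
import tempfile


def load_site_settings(env=None, package='site_overloads'):
    # Return the site-specific settings as a dict of UPPERCASE names.
    env = os.environ if env is None else env
    site = env.get('OVERLOAD_SITE')
    if not site:
        return {}  # no site selected: keep only the global settings
    module = importlib.import_module('%s.%s.settings' % (package, site))
    return {name: getattr(module, name)
            for name in dir(module) if name.isupper()}


def demo():
    # Build a throwaway site_overloads/nasa/settings.py for the demo.
    root = tempfile.mkdtemp()
    site_dir = os.path.join(root, 'site_overloads', 'nasa')
    os.makedirs(site_dir)
    for folder in (os.path.dirname(site_dir), site_dir):
        open(os.path.join(folder, '__init__.py'), 'w').close()
    with open(os.path.join(site_dir, 'settings.py'), 'w') as fh:
        fh.write('DATABASE_NAME = %r\n' % 'nasa.sqlite3')
    sys.path.insert(0, root)
    importlib.invalidate_caches()
    return load_site_settings({'OVERLOAD_SITE': 'nasa'})
```

In the globally shared settings.py, a line such as globals().update(load_site_settings()) would then play the role of the exec line.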

To reiterate our general approach:
&lt;ol&gt;
&lt;li&gt;Use Apache to pass off requests to Django, but before doing so, set an environment variable dependent upon the request url.&lt;/li&gt;
&lt;li&gt;Read this environment variable from within the project’s globally shared settings.py and determine the desired site.&lt;/li&gt;
&lt;li&gt;From within the globally shared settings.py, import the desired site-specific settings module.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&quot;configuringthesite-specificsettingstooverrideglobalsettings&quot;&gt;Configuring the site-specific settings to override global settings&lt;/h3&gt;
Earlier, we defined the boundaries between globally shared components and site-specific components in a franchised Django project (see the table above). With our dynamic configuration setup in place, we can now force site-specific components such as the database, urls, views, and templates to override the default global components. The key to doing so is defining our intentions in the site-specific settings.py. In our NASA research-tracker example, that file would be &lt;em&gt;site&lt;code&gt;_&lt;/code&gt;overloads/nasa/settings.py&lt;/em&gt;.
&lt;h4 id=&quot;settingthesite-specificdatabase&quot;&gt;Setting The Site-Specific Database&lt;/h4&gt;
By adding these few lines, we can specify to Django the exact database to use for our NASA site.

&lt;strong&gt;site&lt;code&gt;_&lt;/code&gt;overloads/nasa/settings.py&lt;/strong&gt;
&lt;pre&gt;&lt;code&gt;import os
# Overload Default database
DATABASE_NAME = os.path.join(&quot;site_overloads/nasa/&quot;, 'database.sqlite3')&lt;/code&gt;&lt;/pre&gt;
&lt;h4 id=&quot;overridingglobaltemplateswithsite-specifictemplates&quot;&gt;Overriding Global Templates with Site-Specific Templates&lt;/h4&gt;
To make sure that templates in the site&lt;code&gt;_&lt;/code&gt;overloads directory override the globally shared templates, we have to modify both the globally-shared and site-specific &lt;em&gt;settings.py&lt;/em&gt; files like so:

&lt;strong&gt;site&lt;code&gt;_&lt;/code&gt;overloads/nasa/settings.py&lt;/strong&gt;
&lt;pre&gt;&lt;code&gt;# Overload Default TEMPLATE_DIRS
SITE_TEMPLATE_DIRS = [
    &quot;site_overloads/nasa/templates&quot;,
]&lt;/code&gt;&lt;/pre&gt;
&lt;strong&gt;globally-shared settings.py&lt;/strong&gt;
&lt;pre&gt;&lt;code&gt;# Prepend the site-specific template directories to TEMPLATE_DIRS.
# tuple() lets this work whether either setting is a list or a tuple.
if SITE_TEMPLATE_DIRS:
    TEMPLATE_DIRS = tuple(SITE_TEMPLATE_DIRS) + tuple(TEMPLATE_DIRS)&lt;/code&gt;&lt;/pre&gt;
As a result of these settings, when rendering templates, files in the &lt;strong&gt;site&lt;code&gt;_&lt;/code&gt;overloads/nasa/templates&lt;/strong&gt; folder will always be chosen over files in the globally shared templates folder, even if they share the same name.
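
The reason the ordering matters is that a filesystem-based template lookup walks the directories in order and stops at the first match. The sketch below imitates that behaviour outside of Django; find_template and demo are hypothetical names used only for illustration, and the directories are created in a temporary folder so the example is self-contained.

```python
import os
import tempfile


def find_template(name, template_dirs):
    # Return the first existing file called `name`, searching in order.
    for directory in template_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


def demo():
    root = tempfile.mkdtemp()
    site_dir = os.path.join(root, 'site_overloads', 'nasa', 'templates')
    global_dir = os.path.join(root, 'templates')
    for directory in (site_dir, global_dir):
        os.makedirs(directory)
        with open(os.path.join(directory, 'base.html'), 'w') as fh:
            fh.write('placeholder')
    # A template that only exists in the globally shared folder.
    open(os.path.join(global_dir, 'about.html'), 'w').close()

    template_dirs = (site_dir, global_dir)  # site-specific first
    chosen = find_template('base.html', template_dirs)
    fallback = find_template('about.html', template_dirs)
    return chosen.startswith(site_dir), fallback.startswith(global_dir)
```

Here base.html resolves to the site-specific copy, while about.html, which only exists globally, still falls back to the shared folder.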
&lt;h4 id=&quot;addingsite-specificurls&quot;&gt;Adding Site-Specific URLs&lt;/h4&gt;
In the globally-shared &lt;em&gt;urls.py&lt;/em&gt;, add these few lines of code to dynamically import the correct site-specific urls:

&lt;strong&gt;urls.py&lt;/strong&gt;
&lt;pre&gt;&lt;code&gt;# import site specific urls
from django.conf import settings
if hasattr(settings, &quot;OVERLOAD_SITE_MODULE&quot;):
   exec(&quot;from %s import urls as site_urls&quot; % settings.OVERLOAD_SITE_MODULE)
   # site patterns come first so they take precedence over the global ones
   urlpatterns = site_urls.urlpatterns + urlpatterns&lt;/code&gt;&lt;/pre&gt;
With this in place you can now add url patterns as normal under your site-specific urls.py.
&lt;h3 id=&quot;testingthesetup&quot;&gt;Testing the setup&lt;/h3&gt;
After making all of those adjustments, you’ll most likely want to try out your new setup without having to start up Apache. A good way to test our solution is to manually set the environment variable when running “manage.py runserver” or “manage.py shell”. Type the following on the command line to run a shell using the site-specific setup. Once inside the Django shell we can poke around at the various site-specific settings and verify that they are what we expect:
&lt;pre&gt;&lt;code&gt;OVERLOAD_SITE=nasa python manage.py shell
&amp;gt; from django.conf import settings
&amp;gt; print( settings.OVERLOAD_SITE )
nasa&lt;/code&gt;&lt;/pre&gt;
&lt;h3 id=&quot;summary&quot;&gt;Summary&lt;/h3&gt;
In this article, I outlined how to franchise your Django project. As our main approach, we used Apache’s ability to set environment variables to dynamically configure Django before responding to any requests. Because of this, we could dynamically route users to site-specific views, urls, templates, and databases depending on the incoming request url.

Franchising allows you to run many Django sites from a single code base. It’s ideal for those times when you want to reuse code that you have written for one site on another project. There is no need to maintain several branches of slightly different code, and as a result, improvements made on one site will simultaneously apply to all sites.
</content>
   
 </entry>
 
 <entry>
   <title>The CRM114 Discriminator - Your own personal secretary on Mac OSX</title>
   <link href="http://www.huyng.com/posts/the-crm114-discrimnator-your-own-personal-secretary-on-mac-osx"/>
   <updated>2009-03-14T00:00:00-07:00</updated>
   <id>http://www.huyng.com/posts/the-crm114-discrimnator-your-own-personal-secretary-on-mac-osx</id>
   
   <category term="machinelearning" />
   

   <content type="html">Imagine your boss comes in one day and says to you, &quot;We have over 100,000 web pages on our site. Of that figure, 10,000 are from spammers. I need you to go through our list of websites and figure out which ones are spam and which are genuine.&quot;

How do you accomplish this task without going crazy? Wouldn't it be great if your computer &lt;em&gt;just&lt;/em&gt; told you whether a webpage was spam or not?

Well, &lt;em&gt;it can&lt;/em&gt;. Just give it some initial training and you'll have your own digital secretary in no time. This is all possible through the &lt;a href=&quot;http://crm114.sourceforge.net/&quot;&gt;CRM114 discriminator&lt;/a&gt;, which is a machine-learning tool to help you classify data according to predetermined samples.

We can use it in our case by first feeding it documents that are known to be &quot;spam&quot;, then feeding it documents that are known to be &quot;genuine&quot;. In these two steps, we are &quot;training&quot; the program to recognize the difference between spam and genuine webpages.

Finally, for any unknown document, we'll run it through CRM114's &quot;classify&quot; function, which will estimate the probability that the document belongs to either the &quot;spam&quot; or &quot;genuine&quot; group based on past training data.
&lt;h5&gt;Trying it out&lt;/h5&gt;
Take a look at some sample code below. It uses &lt;a href=&quot;http://www.elegantchaos.com/node/129&quot;&gt;Sam Dean's&lt;/a&gt; wrapper &lt;a href=&quot;http://source.elegantchaos.com/projects/com/elegantchaos/libraries/python/crm.py&quot;&gt;library&lt;/a&gt;, which provides an easy-to-use Python interface to the CRM114 Discriminator.
&lt;pre&gt;import crm
c = crm.Classifier(&quot;/Users/iamthecheese/Desktop/crm_test_data&quot;, [&quot;genuine&quot;, &quot;spam&quot;])
c.learn(&quot;genuine&quot;, &quot;did you see that jean claude van dam movie?&quot;)
c.learn(&quot;spam&quot;, &quot;Jean claude van dam uses viagra, you should too, here's how...&quot;)
c.classify(&quot;I went to see that movie about the dam today&quot;)&lt;/pre&gt;
If you type that into the Python interactive command prompt and all goes well, you should see the last command return to you:
&lt;pre&gt;('genuine', 0.65529999999999999)&lt;/pre&gt;
Which basically means that based on the set of training data given to CRM114, the phrase &quot;I went to see that movie about the dam today&quot; has a 65% chance of being genuine. Pretty cool huh?

Applying this to our spam problem, just find 20 pages that you know are spam and 20 pages you know are genuine; train CRM114 with this set of data, and unleash it on the remaining 99,960 pages. It'll save you a lot of time and you can use your &quot;personal classification secretary&quot; for bunches of other problems in the future as well.
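
If you want a feel for the train-then-classify workflow before installing anything, here is a toy stand-in written in plain Python. To be clear, this is not CRM114's algorithm and not Sam Dean's wrapper: ToyClassifier is a hypothetical class that does simple word counting with add-one smoothing, and it only borrows the learn/classify method names from the example above.

```python
import math
import re
from collections import Counter


class ToyClassifier:
    # Hypothetical word-count classifier, for illustration only.

    def __init__(self, labels):
        self.counts = {label: Counter() for label in labels}

    def learn(self, label, text):
        self.counts[label].update(re.findall('[a-z]+', text.lower()))

    def classify(self, text):
        vocab = set()
        for counter in self.counts.values():
            vocab.update(counter)
        scores = {}
        for label, counter in self.counts.items():
            # Add-one smoothing, with one extra slot for unseen words.
            denominator = sum(counter.values()) + len(vocab) + 1.0
            scores[label] = sum(
                math.log((counter[word] + 1.0) / denominator)
                for word in re.findall('[a-z]+', text.lower()))
        # Convert log scores into probabilities that sum to 1.
        top = max(scores.values())
        weights = {label: math.exp(s - top) for label, s in scores.items()}
        best = max(weights, key=weights.get)
        return best, weights[best] / sum(weights.values())


c = ToyClassifier(['genuine', 'spam'])
c.learn('genuine', 'did you see that jean claude van dam movie')
c.learn('spam', 'jean claude van dam uses viagra and you should too')
label, probability = c.classify('i went to see that movie about the dam today')
```

With this tiny training set the sentence is labelled genuine. The probability will not match CRM114's figure, since the underlying models are different.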
&lt;h5&gt;Installing the CRM114 Discriminator on Mac OSX&lt;/h5&gt;
So now that you're hooked, let's get this program installed on Mac OSX. Unfortunately, there is currently no MacPorts port for the CRM114 Discriminator, so you'll have to do some digging through Makefiles to get everything working. Here's how to build and install the program from source.
&lt;ol&gt;
	&lt;li&gt;First off, use MacPorts to install &lt;a href=&quot;http://www.laurikari.net/tre/&quot;&gt;TRE&lt;/a&gt;, a regex library that CRM114 depends on
&lt;pre&gt;sudo port install tre&lt;/pre&gt;
&lt;/li&gt;
	&lt;li&gt;Get the source code for CRM114
&lt;pre&gt;cd ~/Desktop/some_folder
wget http://crm114.sourceforge.net/src/&lt;/pre&gt;
&lt;/li&gt;
	&lt;li&gt;Modify the &quot;Makefile&quot; under the src directory by replacing the following line:
&lt;pre&gt;prefix?=/usr     should become --&amp;gt;    prefix?=/opt/local&lt;/pre&gt;
commenting out the following line:
&lt;pre&gt;LDFLAGS += -static -static-libgcc&lt;/pre&gt;
and uncommenting the following lines:
&lt;pre&gt;CFLAGS += -I/opt/local/include -I${HOME}/include
LDFLAGS += -L/opt/local/lib -L${HOME}/lib
LIBS += -lintl -liconv&lt;/pre&gt;
&lt;/li&gt;
	&lt;li&gt;Now save the Makefile and run the make and make install commands in the src directory
&lt;pre&gt;make &amp;amp;&amp;amp; make install&lt;/pre&gt;
&lt;/li&gt;
	&lt;li&gt;Congratulations, now you've got the CRM114 Discriminator installed on your computer! If it's done correctly, you should be able to run the following command in terminal to get the current version
&lt;pre&gt;crm -v&lt;/pre&gt;
&lt;/li&gt;
	&lt;li&gt;Finally to use the above sample code, go download &lt;a href=&quot;http://www.elegantchaos.com/node/129&quot;&gt;Sam Dean's&lt;/a&gt; Python CRM114 wrapper &lt;a href=&quot;http://source.elegantchaos.com/projects/com/elegantchaos/libraries/python/crm.py&quot;&gt;library&lt;/a&gt; and put it in a place where you can import it from python.  ( The site's login/password is &quot;guest&quot;).&lt;/li&gt;
&lt;/ol&gt;
This piece of software uses many cool machine-learning classification techniques which are beyond my ability to explain here. If you're interested, you can read more about the algorithms below:
&lt;ul&gt;
	&lt;li&gt;&lt;a class=&quot;urlextern&quot; title=&quot;http://www.nist.gov/dads/HTML/hiddenMarkovModel.html&quot; rel=&quot;nofollow&quot; href=&quot;http://www.nist.gov/dads/HTML/hiddenMarkovModel.html&quot; target=&quot;from_crm114_wiki_exlink&quot;&gt;Hidden Markov Model&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a class=&quot;urlextern&quot; title=&quot;http://www.cs.ubc.ca/~murphyk/Bayes/bayesrule.html&quot; rel=&quot;nofollow&quot; href=&quot;http://www.cs.ubc.ca/%7Emurphyk/Bayes/bayesrule.html&quot; target=&quot;from_crm114_wiki_exlink&quot;&gt;Bayesian Chain Rule&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a class=&quot;interwiki iw_this&quot; title=&quot;http://crm114.sourceforge.net/wiki/../docs/CRM114_paper.html&quot; href=&quot;http://crm114.sourceforge.net/docs/CRM114_paper.html&quot;&gt;Orthogonal Sparse Bigrams&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a class=&quot;urlextern&quot; title=&quot;http://www.siefkes.net/papers/winnow-spam.pdf&quot; rel=&quot;nofollow&quot; href=&quot;http://www.siefkes.net/papers/winnow-spam.pdf&quot; target=&quot;from_crm114_wiki_exlink&quot;&gt;Winnow&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a class=&quot;interwiki iw_this&quot; title=&quot;http://crm114.sourceforge.net/wiki/../docs/classify_details.txt&quot; href=&quot;http://crm114.sourceforge.net/docs/classify_details.txt&quot;&gt;Correlation&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a class=&quot;interwiki iw_this&quot; title=&quot;http://crm114.sourceforge.net/wiki/../docs/KNN_Hyperspace_Filters.djvu&quot; href=&quot;http://crm114.sourceforge.net/docs/KNN_Hyperspace_Filters.djvu&quot;&gt;KNN/Hyperspace&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a class=&quot;wikilink1&quot; title=&quot;bit_entropy&quot; href=&quot;http://crm114.sourceforge.net/wiki/doku.php?id=bit_entropy&quot;&gt;Bit Entropy&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a class=&quot;interwiki iw_this&quot; title=&quot;http://crm114.sourceforge.net/wiki/../docs/classify_details.txt&quot; href=&quot;http://crm114.sourceforge.net/docs/classify_details.txt&quot;&gt;CLUMP&lt;/a&gt;&lt;/li&gt;
	&lt;li&gt;&lt;a class=&quot;interwiki iw_this&quot; title=&quot;http://crm114.sourceforge.net/wiki/../docs/classify_details.txt&quot; href=&quot;http://crm114.sourceforge.net/docs/classify_details.txt&quot;&gt;SVM&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</content>
   
 </entry>
 
 <entry>
   <title>A Processing.js example ( Tears in Darkness )</title>
   <link href="http://www.huyng.com/posts/a-processingjs-example"/>
   <updated>2009-03-05T00:00:00-08:00</updated>
   <id>http://www.huyng.com/posts/a-processingjs-example</id>
   
   <category term="visualization" />
   
   <category term="javascript" />
   
   <category term="programming" />
   

   <content type="html">&lt;p style=&quot;text-align: center; width: 100%;&quot;&gt;
&lt;script type=&quot;application/processing&quot;&gt;
    Tear[] tears = new Tears[];                   // Create new array to keep track of all the tears
    int num_tears = 0;                            // Keeps track of the current number of tears

    // This function sets up the canvas element
    void setup() {
      size(500, 300);                             // size( width, height ) - sets the canvas size
      background(0);                              // background ( lightness ) or background ( red, green, blue, alpha ) - sets background color
    }

    // This function is automatically called by processing every frame
    void draw() {                                                   
      fill(0, 6);                                 // fill( lightness, alpha ) or fill( red, green, blue, alpha) -  sets fill colors of any shapes drawn hereafter 
      rect(0, 0, width, height);                  // rect( x, y, width, height ) - draws a rectangle whose upper-left corner is at ( x, y )
      noStroke();                                 //  removes border on shapes drawn after this point

      for( int i=0; i &lt; num_tears; i++){
         if ( tears[i] == null) {
            // Do nothing if current object == null
         }
         else if (tears[i].update() == false){
            // Update the current tear; if update() returns false, set the current object to null
            tears[i] = null;
         }
         else{
            // When tear is not null, and update() wasn't false then draw() it
            tears[i].draw();
         }
      }

      if(mousePressed){
         // Add a new tear to the tears array at the current mouse location
         tears[num_tears] = new Tear(mouseX + window.scrollX,mouseY + window.scrollY);
         num_tears +=1;
      }

    }
    
    // This class object represents one tear. A global variable called &quot;tears&quot; stores an array of these individual tears
    class Tear{
       int x,y, alpha;
       float size;
       
       // This function is called when a new tear is created from the &quot;new Tear( x, y)&quot; command. You can pass it the x &amp;amp; y position of the new tear
       Tear(int xin, int yin){
          x = xin;                                    // sets the x coordinate of the tear
          y = yin;                                    // sets the y coordinate of the tear      
          alpha = 255;                                // sets the initial alpha (AKA opacity) of the tear         
          size = 1;                                   // sets original tear radius to be 1px
       }
       
       // This function is called to update the tear's attributes
       boolean update(){
          y += 1;                            // moves the tear down 1px
          alpha -= 5;                        // decrease tear's opacity by 5
          size += .5;                        // increase the tear's size by .5px
          if( y &gt; height || alpha &lt; 0){      // if tear's current height or opacity is out of the visible  range return false
             return false;
          }
          return true;                       // return true if everything went fine
       }
       
       // draw this tear with its current properties
       void draw(){
          fill(0,60, 100+random(0,100), alpha);        // fill ( red, green, blue, alpha )  - sets fill to these attributes, values between 0 and 255
          ellipse(x,y, size,size);                     // draws a circle
       }

    }
&lt;/script&gt;
&lt;b&gt;Click in the canvas&lt;/b&gt;
&lt;canvas style=&quot;width:520px; height: 300px;&quot;&gt;&lt;/canvas&gt;&lt;br&gt;
The processing.js visualization code

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;javascript&quot;&gt;&lt;span class=&quot;nx&quot;&gt;Tear&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;tears&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;Tear&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;                   &lt;span class=&quot;c1&quot;&gt;// Create new array to keep track of all the tears&lt;/span&gt;
&lt;span class=&quot;kr&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;num_tears&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;                            &lt;span class=&quot;c1&quot;&gt;// Keeps track of the current number of tears&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;// This function sets up the canvas element&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;setup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;nx&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;500&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;300&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;                             &lt;span class=&quot;c1&quot;&gt;// size( width, height ) - sets the canvas size&lt;/span&gt;
  &lt;span class=&quot;nx&quot;&gt;background&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;                              &lt;span class=&quot;c1&quot;&gt;// background ( lightness ) or background ( red, green, blue, alpha ) - sets background color&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;// This function is automatically called by processing every frame&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;draw&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;                                                   
  &lt;span class=&quot;nx&quot;&gt;fill&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;                                 &lt;span class=&quot;c1&quot;&gt;// fill( lightness, alpha ) or fill( red, green, blue, alpha) -  sets fill colors of any shapes drawn hereafter &lt;/span&gt;
  &lt;span class=&quot;nx&quot;&gt;rect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;width&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;height&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;                  &lt;span class=&quot;c1&quot;&gt;// rect( x, y, width, height ) - draws a rectangle with its upper-left corner at (x, y)&lt;/span&gt;
  &lt;span class=&quot;nx&quot;&gt;noStroke&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;                                 &lt;span class=&quot;c1&quot;&gt;//  removes border on shapes drawn after this point&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;num_tears&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;
     &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;tears&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;// Do nothing if current object == null&lt;/span&gt;
     &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
     &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;tears&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;update&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;// Update the current tear; if update() returns false, set the current slot to null&lt;/span&gt;
        &lt;span class=&quot;nx&quot;&gt;tears&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
     &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
     &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;// When tear is not null, and update() wasn&amp;#39;t false then draw() it&lt;/span&gt;
        &lt;span class=&quot;nx&quot;&gt;tears&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;draw&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
     &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mousePressed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;
     &lt;span class=&quot;c1&quot;&gt;// Add a new tear to the tears array at the current mouse location&lt;/span&gt;
     &lt;span class=&quot;nx&quot;&gt;tears&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;num_tears&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;Tear&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mouseX&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;window&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;scrollX&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mouseY&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;window&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;scrollY&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
     &lt;span class=&quot;nx&quot;&gt;num_tears&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;// This class object represents one tear. A global variable called &amp;quot;tears&amp;quot; stores an array of these individual tears&lt;/span&gt;
&lt;span class=&quot;kr&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;Tear&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
   &lt;span class=&quot;kr&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;alpha&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
   &lt;span class=&quot;kr&quot;&gt;float&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
   
   &lt;span class=&quot;c1&quot;&gt;// This function is called when a new tear is created from the &amp;quot;new Tear( x, y)&amp;quot; command. You can pass it the x &amp;amp; y position of the new tear&lt;/span&gt;
   &lt;span class=&quot;nx&quot;&gt;Tear&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kr&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;xin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kr&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;yin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;
      &lt;span class=&quot;nx&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;xin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;                                    &lt;span class=&quot;c1&quot;&gt;// sets the x coordinate of the tear&lt;/span&gt;
      &lt;span class=&quot;nx&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;yin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;                                    &lt;span class=&quot;c1&quot;&gt;// sets the y coordinate of the tear      &lt;/span&gt;
      &lt;span class=&quot;nx&quot;&gt;alpha&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;255&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;                                &lt;span class=&quot;c1&quot;&gt;// sets the initial alpha (AKA opacity) of the tear         &lt;/span&gt;
      &lt;span class=&quot;nx&quot;&gt;size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;                                   &lt;span class=&quot;c1&quot;&gt;// sets the initial tear diameter to 1px&lt;/span&gt;
   &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
   
   &lt;span class=&quot;c1&quot;&gt;// This function is called to update the tear&amp;#39;s attributes&lt;/span&gt;
   &lt;span class=&quot;kr&quot;&gt;boolean&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;update&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(){&lt;/span&gt;
      &lt;span class=&quot;nx&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;                            &lt;span class=&quot;c1&quot;&gt;// moves the tear down 1px&lt;/span&gt;
      &lt;span class=&quot;nx&quot;&gt;alpha&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;                        &lt;span class=&quot;c1&quot;&gt;// decrease tear&amp;#39;s opacity by 5&lt;/span&gt;
      &lt;span class=&quot;nx&quot;&gt;size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;                        &lt;span class=&quot;c1&quot;&gt;// increase the tear&amp;#39;s size by .5px&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;height&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;alpha&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;      &lt;span class=&quot;c1&quot;&gt;// if tear&amp;#39;s current height or opacity is out of the visible  range return false&lt;/span&gt;
         &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;                       &lt;span class=&quot;c1&quot;&gt;// return true if everything went fine&lt;/span&gt;
   &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
   
   &lt;span class=&quot;c1&quot;&gt;// draw this tear with its current properties&lt;/span&gt;
   &lt;span class=&quot;k&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;draw&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(){&lt;/span&gt;
      &lt;span class=&quot;nx&quot;&gt;fill&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;60&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;random&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;alpha&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;        &lt;span class=&quot;c1&quot;&gt;// fill ( red, green, blue, alpha )  - sets fill to these attributes, values between 0 and 255&lt;/span&gt;
      &lt;span class=&quot;nx&quot;&gt;ellipse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;                     &lt;span class=&quot;c1&quot;&gt;// draws a circle&lt;/span&gt;
   &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;&lt;a href=&quot;http://ejohn.org/blog/processingjs/&quot;&gt;Processing.js&lt;/a&gt; makes coding graphics for the HTML &amp;lt;canvas&amp;gt; element a dream come true. &lt;a href=&quot;http://reas.com/&quot;&gt;Casey Reas&lt;/a&gt; and &lt;a href=&quot;http://benfry.com/&quot;&gt;Ben Fry&lt;/a&gt; created the &lt;a href=&quot;http://processing.org&quot;&gt;original&lt;/a&gt; library and programming environment for manipulating graphics with Java, and have built up a vibrant community around their work. The second revolution, however, comes with &lt;a href=&quot;http://ejohn.org&quot;&gt;John Resig's&lt;/a&gt; JavaScript port of those same programming libraries. Because of his port, you can now do things with open technologies such as JavaScript &amp;amp; HTML that were once only possible with Flash. I've been exploring it, and this is what I've learned so far.
&lt;/p&gt;        
&lt;h5&gt;How was the visualization above created?&lt;/h5&gt;
&lt;p&gt; Excuse the emo title for the graphic above, but I was going for something simple so that I could quickly learn the system. Animating a tear came to mind as an easy beginner's project. To begin, let us start with an understanding of what a primitive animation is: &lt;em&gt;a series of pictures rapidly displayed one after another&lt;/em&gt;. Processing.js realizes this notion with two functions, &lt;strong&gt;setup()&lt;/strong&gt; and &lt;strong&gt;draw()&lt;/strong&gt;. You define &lt;strong&gt;setup()&lt;/strong&gt; knowing that Processing.js will call it automatically, once, at the beginning of your visualization to draw the first &quot;picture&quot; of the scene. Following this, it will call the &lt;strong&gt;draw()&lt;/strong&gt; function repeatedly, several times per second, to generate all of the subsequent &quot;pictures&quot; of your scene. &lt;/p&gt;
&lt;b&gt;The snippet below sets up your first &quot;picture&quot; to be 600 pixels wide and 300 pixels high. &lt;/b&gt;
&lt;pre lang=&quot;java&quot;&gt;
    // This function sets up the canvas element
    void setup() {
      size(600, 300);                              // size( width, height ) - sets the canvas size
      background(0);                               // background ( lightness ) or background ( red, green, blue, alpha ) - sets background color
    }
&lt;/pre&gt;
&lt;h5&gt;Creating a tear&lt;/h5&gt;
&lt;p&gt;
 In my world, a tear is basically a circle that grows as it falls, fading as it goes until it disappears. That is a tear. To represent that, I create a class to store its current position, size, and opacity:
 &lt;/p&gt;

&lt;pre lang=&quot;java&quot;&gt;   // This class object represents one tear. A global variable called &quot;tears&quot; stores an array of these individual tears
    class Tear{
       int x,y, alpha;
       float size;
       
       // This function is called when a new tear is created from the &quot;new Tear( x, y)&quot; command. You can pass it the x &amp; y position of the new tear
       Tear(int xin, int yin){
          x = xin;                               // sets the x coordinate of the tear
          y = yin;                               // sets the y coordinate of the tear      
          alpha = 255;                           // sets the initial alpha (AKA opacity) of the tear         
          size = 1;                              // sets the initial tear diameter to 1px
       }
       
       // This function is called to update the tear's attributes
       boolean update(){
          y += 1;                                               // moves the tear down 1px
          alpha -= 5;                                           // decrease tear's opacity by 5
          size += .5;                                           // increase the tear's size by .5px
          if( y &gt; height || alpha &lt; 0){                         // if tear's current height or opacity is out of the visible  range return false
             return false;
          }
          return true;                                          // return true if everything went fine
       }
       
       // draw this tear with its current properties
       void draw(){
          fill(0,60, 100+random(0,100), alpha);        // fill ( red, green, blue, alpha )  - sets fill to these attributes, values between 0 and 255
          ellipse(x,y, size,size);                     // draws a circle
       }

    }&lt;/pre&gt;
&lt;p&gt;
The two major things to notice are 1) the &lt;strong&gt;Tear.update()&lt;/strong&gt; function, which changes the tear's properties according to predefined rules, and 2) the &lt;strong&gt;Tear.draw()&lt;/strong&gt; function, which reads the tear's current properties and does the actual drawing.&lt;/p&gt; 
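&lt;p&gt;To get a feel for these rules, here is a small worked example in plain Java (it is not part of the sketch) that computes how long a tear lives with the numbers used above. The class and method names are made up for illustration, and it assumes the tear fades out before it reaches the bottom of the canvas.&lt;/p&gt;
&lt;pre lang=&quot;java&quot;&gt;
public class TearLifetime {
    // Index of the first update() call that returns false because alpha dropped below zero.
    // Assumes the tear does not fall past the bottom of the canvas first.
    static int firstFailingUpdate(int startAlpha, int fadeStep) {
        // smallest k for which startAlpha - k * fadeStep is negative
        return startAlpha / fadeStep + 1;
    }

    // Number of frames in which the tear is actually drawn.
    static int framesDrawn(int startAlpha, int fadeStep) {
        return firstFailingUpdate(startAlpha, fadeStep) - 1;
    }

    // Size passed to ellipse() after the given number of updates.
    static float sizeAfter(float startSize, float growth, int updates) {
        return startSize + growth * updates;
    }

    public static void main(String[] args) {
        int frames = framesDrawn(255, 5);
        System.out.println(frames);                      // frames the tear is drawn
        System.out.println(sizeAfter(1, 0.5f, frames));  // size on the last drawn frame
    }
}
&lt;/pre&gt;
&lt;p&gt;With the values from the constructor and update() (alpha 255, fading by 5, growing by .5), a tear is drawn for 51 frames, falls 51 pixels, and reaches a size of 26.5 before the next update() returns false.&lt;/p&gt;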
&lt;h5&gt;Dealing with many tears&lt;/h5&gt;
&lt;p&gt; Since our scene will have many tears at once, I'll set up an array of tears to keep track of them all:&lt;/p&gt;
&lt;pre &gt;
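    Tear[] tears = new Tear[0];                    // Array of all the tears; Processing.js backs it with a JavaScript array, so it grows as tears are added
    int num_tears = 0;                             // Keeps track of the current number of tears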
&lt;/pre&gt;
&lt;h5&gt;Running the draw() loop&lt;/h5&gt;
&lt;p&gt;
Now that we've got the basic data structures down, we can stick them into the &lt;strong&gt;draw()&lt;/strong&gt; loop to run them. The code below loops through all current tears, updating each one and drawing those that are still alive.&lt;/p&gt;
&lt;pre lang=&quot;java&quot;&gt;
// This function is automatically called by processing every frame
    void draw() {                                                   
      fill(0, 6);                                  // fill( lightness, alpha ) or fill( red, green, blue, alpha) -  sets fill colors of any shapes drawn hereafter 
      rect(0, 0, width, height);                   // rect( x, y, width, height ) - draws a rectangle with its upper-left corner at (x, y)
      noStroke();                                  //  removes border on shapes drawn after this point

      for( int i=0; i &lt; num_tears; i++){
         if ( tears[i] == null) {
            // Do nothing if current object == null
         }
         else if (tears[i].update() == false){
            // Update the current tear; if update() returns false, set the current slot to null
            tears[i] = null;
         }
         else{
            // When tear is not null, and update() wasn't false then draw() it
            tears[i].draw();
         }
      }
&lt;/pre&gt; 
&lt;h5&gt; mousePressed interactivity &lt;/h5&gt;
&lt;p&gt; Finally, I create a little interactivity by checking the mousePressed variable on every frame. If someone clicks on the canvas, it will generate another tear object at that mouse location. You'll notice I've augmented the mouseX and mouseY variables to also include how much the user has scrolled the window. The mouseX and mouseY positions are relative to the original position of the canvas element and do not reflect changes to its position when scrolling.&lt;/p&gt;
&lt;pre lang=&quot;java&quot;&gt;
      if(mousePressed){
         // Add a new tear to the tears array at the current mouse location
         tears[num_tears] = new Tear(mouseX + window.scrollX ,mouseY+window.scrollY);
         num_tears +=1;
      }

    }
&lt;/pre&gt;
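&lt;p&gt;One thing the loop above never does is reuse the slots it sets to null: num_tears only grows, so every frame scans every tear ever created. A possible remedy, sketched here in plain Java, is to compact the array once per frame. The class name TearSlots and the compact() helper are hypothetical and not part of the original sketch; the integers in main() simply stand in for Tear objects.&lt;/p&gt;
&lt;pre lang=&quot;java&quot;&gt;
public class TearSlots {
    // Moves every non-null entry among the first count slots to the front,
    // keeping their order, clears the freed slots, and returns the new count.
    static int compact(Object[] items, int count) {
        int kept = 0;
        for (int i = 0; i != count; i++) {
            if (items[i] != null) {
                items[kept] = items[i];
                kept++;
            }
        }
        for (int i = kept; i != count; i++) {
            items[i] = null;   // drop stale references left behind by the move
        }
        return kept;
    }

    public static void main(String[] args) {
        Object[] slots = { null, 1, null, 2, 3, null };  // null marks a dead tear
        int live = compact(slots, slots.length);
        System.out.println(live);                        // number of tears still alive
    }
}
&lt;/pre&gt;
&lt;p&gt;In the sketch, something like num_tears = compact(tears, num_tears); at the end of draw() would keep the loop proportional to the number of tears still on screen.&lt;/p&gt;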
&lt;h5&gt;How did you get that nice fading effect?&lt;/h5&gt;
&lt;p&gt;The fading effect was created by painting a semi-transparent black rectangle over the whole canvas at the start of each frame. These rectangles pile up frame after frame and eventually cover up the older tears. &lt;/p&gt;
&lt;pre lang=&quot;java&quot;&gt;
 void draw() {                                                   
      fill(0, 6);                                  // fill( lightness, alpha ) or fill( red, green, blue, alpha) -  sets fill colors of any shapes drawn hereafter 
      rect(0, 0, width, height);                   // rect( x, y, width, height ) - draws a rectangle with its upper-left corner at (x, y)
&lt;/pre&gt;
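&lt;p&gt;The numbers behind the fade can be worked out with a little arithmetic. Painting black at alpha 6 (out of 255) over a pixel leaves 1 - 6/255 of its previous brightness, so after n frames an idealized pixel keeps that fraction raised to the power n. The plain-Java sketch below computes this; the class and method names are hypothetical, and it ignores the 8-bit rounding a real canvas performs, which can leave faint trails that never fully disappear.&lt;/p&gt;
&lt;pre lang=&quot;java&quot;&gt;
public class FadeSketch {
    // Fraction of a pixel's brightness that survives one black overlay
    // drawn with the given alpha (0 to 255).
    static double survivingFraction(int overlayAlpha) {
        return 1.0 - overlayAlpha / 255.0;
    }

    // Idealized brightness after the overlay has been painted for a number of frames.
    static double brightnessAfter(double start, int overlayAlpha, int frames) {
        return start * Math.pow(survivingFraction(overlayAlpha), frames);
    }

    public static void main(String[] args) {
        // Brightness of a pixel that started at 200 after 60 frames of fill(0, 6)
        System.out.println(brightnessAfter(200.0, 6, 60));
    }
}
&lt;/pre&gt;
&lt;p&gt;With fill(0, 6), a little under a quarter of the original brightness remains after 60 frames. A larger alpha makes the trails shorter, and a smaller one makes them linger.&lt;/p&gt;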
 
&lt;h5&gt;Summary&lt;/h5&gt;
&lt;p&gt; In summary, this animation was built by setting up a Tear class that holds attributes like the tear's position. The Tear class has functions to update() its properties and to draw() it. A Tear is generated every time someone clicks on the canvas and is stored in an array of Tears. Every time Processing.js calls draw(), a loop goes through the list of existing tears to update() their properties and then draw() them. Overall, this is a fun little system that I'll be using in the future to generate some cool infographics.&lt;/p&gt; 
</content>
   
 </entry>
 
 <entry>
   <title>Stock photos in the public domain</title>
   <link href="http://www.huyng.com/posts/public-domains"/>
   <updated>2009-03-04T00:00:00-08:00</updated>
   <id>http://www.huyng.com/posts/public-domains</id>
   

   <content type="html">&lt;strong&gt;Objective: &lt;/strong&gt;
To find a decent sized collection of stock photos that are in the public domain or have expired copyrights.

&lt;strong&gt;Results:&lt;/strong&gt;
&lt;a href=&quot;http://www.everystockphoto.com/&quot;&gt;EveryStockPhoto Search Engine&lt;/a&gt; - Nice interface, giant collection of public domain stock photos
&lt;a href=&quot;http://www.public-domain-photos.com/&quot;&gt;Public Domain Photos .com&lt;/a&gt; - well organized collection of public domain photos.
&lt;a href=&quot;http://en.wikipedia.org/wiki/Public_domain_image_resources&quot;&gt;Wikipedia Public Domain Photos&lt;/a&gt; - contains a wide collection of public domain photos.
&lt;a href=&quot;http://images.fws.gov/&quot;&gt;U.S. Fish &amp;amp; Wildlife Service&lt;/a&gt; - contains lots of nature photos
&lt;a href=&quot;http://hubblesite.org/gallery/&quot;&gt;HubbleSite&lt;/a&gt; - space photos from NASA
&lt;a href=&quot;http://gimp-savvy.com/PHOTO-ARCHIVE/&quot;&gt;Gimp-Savvy&lt;/a&gt; - aggregation of public domain photos from many government sources
</content>
   
 </entry>
 
 <entry>
   <title>Writing Simple, Effective Documentation</title>
   <link href="http://www.huyng.com/posts/writing-simple-effective-documentation"/>
   <updated>2009-02-06T00:00:00-08:00</updated>
   <id>http://www.huyng.com/posts/writing-simple-effective-documentation</id>
   
   <category term="programming" />
   

   <content type="html">Documentation is not hard though there is a temptation to make it so.  In this post, we’ll go over the details of generating documentation that can be exported to a beautifully typeset PDF and HTML from one source file.

In case you're interested, here are the results: &lt;a href=&quot;/media/3004/article.markdown&quot;&gt;Markdown&lt;/a&gt;, &lt;a href=&quot;/media/3004/article.pdf&quot;&gt;PDF&lt;/a&gt;, &lt;a href=&quot;/media/3004/article.html&quot;&gt;XHTML&lt;/a&gt;.

Because we'll want to produce the final documents in multiple formats, whatever we write will have to be formatted in a simple “meta” syntax that can eventually be converted. For a brief moment along my personal documentation journey, I was led astray by XML, DocBook, toolchains, and a dizzying myriad of other meta format options.
&lt;h5&gt;Salvation By MultiMarkdown&lt;/h5&gt;
Luckily for me, salvation came in the form of &lt;a href=&quot;http://fletcherpenney.net/multimarkdown/&quot;&gt;MultiMarkdown&lt;/a&gt;, a simple syntax to mark up your text without the unnecessary bloat of XML. With this and a combination of other tools, you write your documentation once and export it to the multitude of supported formats.

The diagram below provides an overview of the process for taking your text and transforming it into PDF and HTML. As you can see, I did end up using a small bit of XML technology (specifically XSLTs) to generate a table of contents from my MultiMarkdown XHTML output.
&lt;p style=&quot;text-align: center;&quot;&gt;&lt;img class=&quot;size-full wp-image-113 aligncenter&quot; title=&quot;process&quot; src=&quot;/media/3004/process.png&quot; alt=&quot;process&quot; /&gt;&lt;/p&gt;
To get started, all you need is something to convert your MultiMarkdown documents into the desired format.  For this task you can either download &lt;a href=&quot;http://johnmacfarlane.net/pandoc/&quot;&gt;pandoc&lt;/a&gt;, or the &lt;a href=&quot;http://files.fletcherpenney.net/MultiMarkdown.zip&quot;&gt;conversion scripts&lt;/a&gt; from the creator of MultiMarkdown’s &lt;a href=&quot;http://fletcherpenney.net/multimarkdown/&quot;&gt;site&lt;/a&gt;. Although &lt;a href=&quot;http://johnmacfarlane.net/pandoc/&quot;&gt;pandoc&lt;/a&gt; looks like a promising tool, I personally had trouble installing it on my computer because of issues with MacPorts &amp;amp; GHC.
&lt;h5&gt;Typing the Documentation&lt;/h5&gt;
One of MultiMarkdown’s best features is its simple-to-learn syntax. It’s so intuitive, you don’t feel like you’re learning anything new at all. Below are a few commands to help you get started. For further help in TextMate, use the Control+H key combo to pull up the built-in cheatsheet.
&lt;pre&gt;# This is an H1 tag
## This is an H2 tag
### This is an H3 tag
... And so on
[This is a Link](http://www.google.com)
![This is an Image](my_image.jpg)
CSS: style.css&lt;/pre&gt;
The last command, “CSS: style.css” tells the MultiMarkdown converter to link up style.css to the converted html file. I like to include it at the very top of any markdown document so that I can later stylize it with my own custom CSS.
&lt;h5&gt;Converting To Stylized HTML&lt;/h5&gt;
Converting your MultiMarkdown to html is a simple process:
&lt;ol&gt;
	&lt;li&gt;Run multimarkdown2XHTML.pl provided in the “bin” directory of the MultiMarkdown &lt;a href=&quot;http://files.fletcherpenney.net/MultiMarkdown.zip&quot;&gt;distribution&lt;/a&gt; to transform your text.
&lt;pre&gt;multimarkdown2XHTML.pl file.markdown &amp;gt; file.html&lt;/pre&gt;
&lt;/li&gt;
	&lt;li&gt;Add a table of contents using xsltproc and xhtml-toc-h2.xslt
&lt;pre&gt;&lt;code&gt; xsltproc xhtml-toc-h2.xslt file.html &amp;gt; file_with_toc.html&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
	&lt;li&gt;Stylize your final HTML with some nice CSS. &lt;a href=&quot;/media/3004/documentation.css&quot;&gt;Here’s the CSS that I’ve used for my own documentation&lt;/a&gt;. It’s a modification on the oh-so-beautiful FreeBSD documentation theme.&lt;/li&gt;
&lt;/ol&gt;
&lt;h5 id=&quot;convertingtoabeautifullytypesetpdf&quot;&gt;Converting to a Beautifully Typeset PDF&lt;/h5&gt;
Converting to PDF basically follows the same path, just with some different tools.
&lt;ol&gt;
	&lt;li&gt;Run multimarkdown2latex.pl provided in the “bin” directory of the MultiMarkdown &lt;a href=&quot;http://files.fletcherpenney.net/MultiMarkdown.zip&quot;&gt;distribution&lt;/a&gt; to transform your text into LaTeX.
&lt;pre&gt;multimarkdown2latex.pl file.markdown &amp;gt; file.tex&lt;/pre&gt;
&lt;/li&gt;
	&lt;li&gt;Add some LaTeX &lt;a href=&quot;http://www.artofproblemsolving.com/LaTeX/AoPS_L_GuideLay.php#start&quot;&gt;formatting options&lt;/a&gt; if needed. For example, I added the following option to prevent indentation at the beginning of paragraphs.
&lt;pre&gt;\setlength{\parindent}{0em}&lt;/pre&gt;
&lt;/li&gt;
	&lt;li&gt;Use &lt;a href=&quot;http://www.tug.org/applications/pdftex/&quot;&gt;pdflatex&lt;/a&gt; to turn the output from the previous step into a PDF.
&lt;pre&gt;pdflatex file.tex&lt;/pre&gt;
&lt;/li&gt;
&lt;/ol&gt;
If you’re lucky enough to have TextMate, all of the conversion scripts mentioned above are prepackaged in the “Markdown” bundle for you.
&lt;p style=&quot;text-align: center;&quot;&gt;&lt;img class=&quot;size-full wp-image-116 aligncenter&quot; title=&quot;textmate_snapshot&quot; src=&quot;/media/3004/textmate_snapshot.png&quot; alt=&quot;textmate_snapshot&quot; /&gt;&lt;/p&gt;

&lt;h5 id=&quot;finalthoughts&quot;&gt;Final Thoughts&lt;/h5&gt;
So there you have it. With MultiMarkdown, you write one document and transform it into both PDF and HTML with some basic tools. Overall, the format is concise, easily readable in raw form, and easy to maintain since it’s just plain old text. It is versatile and has a mature set of tools that work with it. I’ve come to love it, just as the creator of &lt;a href=&quot;http://blog.macromates.com/2005/textmate-manual/&quot;&gt;TextMate&lt;/a&gt; and many others have. For small to medium-sized projects, it’s the perfect documentation tool.

&lt;strong&gt;If you’re wondering what the final results look like, I wrote this article in markdown. Here is a sample of the &lt;a href=&quot;/media/3004/article.markdown&quot;&gt;Markdown&lt;/a&gt;, &lt;a href=&quot;/media/3004/article.pdf&quot;&gt;PDF&lt;/a&gt;, and &lt;a href=&quot;/media/3004/article.html&quot;&gt;XHTML&lt;/a&gt; that I generated.&lt;/strong&gt;
&lt;h5 id=&quot;resourcesmentioned&quot;&gt;See Also&lt;/h5&gt;
&lt;ul&gt;
	&lt;li&gt;&lt;strong&gt;&lt;a href=&quot;http://jacobian.org/writing/great-documentation/what-to-write/&quot;&gt;Writing Great Documentation - What to Write&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;&lt;a href=&quot;http://fletcherpenney.net/multimarkdown/&quot;&gt;MultiMarkdown’s Main Site&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;&lt;a href=&quot;http://files.fletcherpenney.net/MultiMarkdown.zip&quot;&gt;MultiMarkdown Scripts&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;&lt;a href=&quot;http://johnmacfarlane.net/pandoc/&quot;&gt;Pandoc&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;&lt;a href=&quot;///m.huyng.com/uploads/documentation.css&quot;&gt;My Custom CSS for Documentation&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;&lt;a href=&quot;http://www.tug.org/applications/pdftex/&quot;&gt;pdfLaTeX&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;&lt;a href=&quot;http://www.artofproblemsolving.com/LaTeX/AoPS_L_GuideLay.php#start&quot;&gt;LaTeX Formatting Options Tutorial&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
</content>
   
 </entry>
 
 <entry>
   <title>The cloud is the new subprime mortgage</title>
   <link href="http://www.huyng.com/posts/cloud-caution"/>
   <updated>2009-01-23T00:00:00-08:00</updated>
   <id>http://www.huyng.com/posts/cloud-caution</id>
   

   <content type="html">Dear Developers,

Please do not repeat the same mistakes that people in the housing market have made...

In looking at the current financial crisis, one can't help but draw similarities between the housing market and the tech sector. The housing crisis occurred, in a simplified view, because financial institutions piled layers upon layers of complex contractual obligations on top of a faulty asset: loans to people who couldn't pay them. While there were many players in the game, each relied on the middleman before them to guarantee that the loans wouldn't go bad. They diced, repackaged, and resold these loan contracts to unsuspecting buyers and to themselves while under the illusion that these assets were safe. In doing so, everyone believed they had much more money than they actually possessed. So when one weak link broke, their ever more entangled and fragile web of dependencies came tumbling down.

Software, without much of a leap of imagination, is not so far from a contractual obligation, and data is the Internet's currency. With an abundance of programming APIs that pull data from one service to another, developers willingly building mashable apps on cloud platforms such as Google App Engine and AWS, and more people depending on these services than ever, it's hard not to see that we are heading down the same path that finance took: a path toward an information infrastructure crash.

Although the cracks have only recently begun to show, with the announcement of the end of Google Notebook and Jaiku, Pownce closing its doors, and Twitter reducing its availability to API developers, these seemingly innocuous events send a clear signal that these virtual streams of data and services cannot be relied upon.

Similar to the subprime crisis, when these services close, they bring down with them a slew of other companies that relied upon them. When they go down, they carry with them the thousands of man-hours of free labor that users put into updating profiles, posting pictures, and commenting on each other's pages. The web can only get so social, and maybe now is the time to consider a small dose of self-reliance.

If not, people will be asking one day how we ever led ourselves to believe that companies like Google or Amazon were too big to fail.
</content>
   
 </entry>
 
 <entry>
   <title>How to use chmod</title>
   <link href="http://www.huyng.com/posts/how-to-use-chmod"/>
   <updated>2009-01-10T00:00:00-08:00</updated>
   <id>http://www.huyng.com/posts/how-to-use-chmod</id>
   
   <category term="programming" />
   

   <content type="html">Chmod is a command used to change file permissions. Given the right permissions, you can restrict or grant access to a file for specific groups or users. Here's the basic format:

&quot;chmod 755 myfile.txt&quot;

A file can have the following permissions corresponding to these numbers:

Read = 4
Write = 2
Execute = 1

To combine permissions, you just add the numbers together (for example, read and write permission is 4 + 2 = 6). A file grants these permissions to three classes of users, and each digit in the &quot;755&quot; of the chmod command above represents one of these classes:

Left-most Digit    = Owner
Middle Digit         = The file's group
Right-most Digit = All other users

So in the command above, we gave the owner read, write, and execute permissions (7 = 4 + 2 + 1); the file's group got read and execute permissions (5 = 4 + 1); and all other users also got read and execute permissions (5 = 4 + 1).
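
To make the arithmetic concrete, here is a minimal Python sketch that decodes a three-digit mode into the familiar rwx notation. The function name describe_mode and the labels "owner", "group", and "others" are my own choices for this illustration; they are not part of chmod itself.

```python
# Hypothetical helper for illustration only; chmod has no such function.
PERMISSIONS = (("r", 4), ("w", 2), ("x", 1))  # read, write, execute
CLASSES = ("owner", "group", "others")        # left, middle, right digit


def describe_mode(mode):
    """Decode a three-digit chmod mode such as "755" into rwx strings."""
    if len(mode) != 3:
        raise ValueError("expected three digits, e.g. '755'")
    result = {}
    for user_class, digit in zip(CLASSES, mode):
        value = int(digit)
        if value not in range(8):
            raise ValueError("each digit must be between 0 and 7")
        flags = ""
        for letter, weight in PERMISSIONS:
            # divmod peels off one permission at a time: 5 = 4 + 0 + 1
            granted, value = divmod(value, weight)
            flags += letter if granted else "-"
        result[user_class] = flags
    return result


print(describe_mode("755"))
# {'owner': 'rwx', 'group': 'r-x', 'others': 'r-x'}
```

Reading the result for 755: the owner can read, write, and execute, while the group and everyone else can read and execute but not write.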

Hope that helps.
</content>
   
 </entry>
 
 
</feed>
