Warrior Tang's Journal
 

Below are the 20 most recent journal entries recorded in Warrior Tang's LiveJournal:

    Wednesday, July 8th, 2009
    8:40 pm
    The Facebook Panopticon, part 2

    Facebook just recommended my dad as a potential Facebook Friend. He lives on the other side of the country.

    See the earlier post for the background on why this is very strange.

    I'm leaning towards thinking that they linked me up to a data mine based on the email address that I used to sign up with the service. I want to know what their database looks like and what information they have on me.

    Current Mood: tired
    Current Music: The Cranberries - Zombie

    6:28 pm
    Notes from work - image path relative to CSS, MySQL table locking, and project LoC count

    The website has a header image that is displayed on all pages. Since the site has multiple directories of different depths, the original coders could not use an image tag, because the relative path to the image differs from directory to directory. They could have chosen to use an absolute path, but they didn't. Their solution was to make the image path relative to the CSS file by loading the image in CSS.

    div#headerImg {
       background: url(../images/banner.jpg);
    }

    The relative path to the CSS file is different for the different directories, but the original developers were content to hardcode the different depths to load the CSS file.

    I wanted to make the header image link to the site index page, but that is a little hard to do when there is no image tag to wrap the anchor tag around. If you know how to do it, it's only a little hard. I can't wrap an A around an image that is not there, but I can put one inside the div. Then it takes some CSS to make the anchor a clickable block.

    /* Allow clicking on the banner */
    #headerImg a { display: block; width: 100%; height: 84px;}

    Note: I added PHP code to autodetect the "../" depth, but the current header image system works so I haven't fixed it.
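    The post does not show the autodetect code, so the following is only a guess at the idea, sketched in Python rather than the site's PHP: count how many directories deep the requested page sits and emit that many "../" segments. The function name and the example paths are made up for illustration.

```python
import posixpath


def prefix_to_root(request_path):
    """Return the "../" prefix that climbs from the page's directory to the site root."""
    directory = posixpath.dirname(request_path)  # "/a/b/page.php" -> "/a/b"
    depth = len([part for part in directory.split("/") if part])
    return "../" * depth


# Hypothetical page paths, relative to the site root.
root_page = prefix_to_root("/index.php")
deep_page = prefix_to_root("/reports/2009/summary.php")
```

    With a prefix like this, a stylesheet link could be written as prefix + "css/site.css" from any directory.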


    Current Mood: geeky
    Current Music: Genesis - Land of Confusion
    Tuesday, July 7th, 2009
    8:56 am
    More notes from work - NULL and 0 in a SQL database

    I noticed something unusual about the database. The designers never use NULL. Where a NULL might be used, there is a zero in the field instead.

    I can see a potential reason for avoiding NULL: it complicates database querying logic. For example, it can be expected that a record with a NULL value will not show up in a list of "any number between one and ten". One might not expect that a list of "any number NOT between one and ten" will also leave out the NULL record! NULL is not a number or anything. It represents an unknown value, the lack of any information, so any comparison against it is neither true nor false. To get every record where there is not a number between one and ten, the query would have to be along the lines of "number IS NULL OR NOT (number >= 1 AND number <= 10)".
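    The behaviour is easy to see in a throwaway database. This sketch uses Python's built-in sqlite3 module as a stand-in for MySQL (the NULL comparison rules shown here are standard SQL); the table and column names are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (label TEXT, number INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("five", 5), ("twenty", 20), ("unknown", None)],  # None is stored as NULL
)


def labels(where):
    """Return the labels of the rows matching the given WHERE clause."""
    rows = conn.execute(f"SELECT label FROM readings WHERE {where} ORDER BY label")
    return [row[0] for row in rows]


between = labels("number BETWEEN 1 AND 10")
not_between = labels("number NOT BETWEEN 1 AND 10")
not_between_or_null = labels("number IS NULL OR number NOT BETWEEN 1 AND 10")
```

    The NULL row is missing from both of the first two lists and only appears once IS NULL is asked for explicitly.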

    A problem with using zero to mean "no data" is that there is no zero record in the corresponding table with the related data for that data field. This makes it difficult to add foreign keys to guarantee relational integrity. I could add a zero record and call it "Unknown", but then our web forms break. There is probably an easy solution to that but I have not looked into it yet. If I convert all the zero records to NULLs, I don't know what will break, if anything does.

    At first I thought the use of zero instead of NULL was too widespread for it not to be intentional. Then I discovered how the zeros were being added. It's not intentional, it's a MySQLism.

    In an earlier post, I mentioned how the original developers write all their inputs as 'strings' and that this should cause the inserts into numeric fields to fail in just about every database but MySQL. MySQL converts strings into numbers. I assume that '123abc' would be converted to 123, but I haven't confirmed this. If the insert routine gets no information from the original data source, that is, if it tries to insert '$foo' while $foo is empty, it sends MySQL a null string '', which MySQL converts to zero, not NULL.

    For the non-CS major, a null string is not NULL. The null string is a string of zero characters, while NULL is no string, no information.
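    One way to keep the two apart on the way into the database is to map empty inputs to a real NULL before the insert. This is a sketch only, in Python with sqlite3 standing in for PHP and MySQL; the helper name, table, and columns are hypothetical.

```python
import sqlite3


def to_sql_value(raw):
    """Map an empty or whitespace-only form input to None (SQL NULL); otherwise an int."""
    text = raw.strip()
    return int(text) if text else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, category_id INTEGER)")
for name, raw in [("widget", "7"), ("gadget", "")]:
    # Bound parameters: None is stored as NULL, not as '' or 0.
    conn.execute("INSERT INTO items VALUES (?, ?)", (name, to_sql_value(raw)))

stored = dict(conn.execute("SELECT name, category_id FROM items"))
```

    Here "gadget" ends up with a NULL category instead of a zero that points at no record.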


    That explains the most common case. However, there are also a couple of places where zero was intentionally used instead of NULL: a database field which defaults to zero instead of NULL, and an update query which uses zero. I will have to try using NULL there and see what breaks.



    Current Mood: geeky
    Current Music: Butthole Surfers - Pepper
    Sunday, July 5th, 2009
    6:42 pm
    Interesting article on the Smithsonian redesign

    A historian gripes about the redesigned Smithsonian Museum resembling a "shopping mall". Excerpts:

    Whole huge exhibits have disappeared. The first floor used to house an exhibit entitled "Information Age: People, Information and Technology." This was a 14,000 square foot display with over 900 original artifacts: Samuel Morse’s telegraphs, Alexander Bell's telephones, a Hollerith punched card machine, a German ENIGMA encoder, the ENIAC computer, the TELESTAR test satellite, and a selection of early personal computers, among many other artifacts. This has all been crated-up and stored away in a warehouse somewhere and replaced by "Julia Child’s Kitchen."

    [...] one hugely popular and impressive exhibit in the “old” museum was the Foucault Pendulum (which was removed prior to the current renovations). The Foucault Pendulum consisted of a 52-foot cable suspended from the ceiling down through a round opening in the second floor, with a 240 lb. brass globe at the end. A row of candles was set up on the first floor, and the motion of the pendulum over the course of the day, as it knocked down the candles one-by-one, demonstrated the earth’s motion. This was a very physical—one is tempted to say, 19th century—way to communicate something fundamental about the physics of our planet. Now this sort of information is conveyed only on touch-screen video monitors.



    Current Mood: calm
    Current Music: Weird Al - Couch Potato
    Wednesday, July 1st, 2009
    7:28 pm
    Firefox 3.5, now with 10% faster Javascript!

    Firefox 3.5 is out, featuring an overhauled Javascript engine that is supposed to be much faster than the old one. Naturally, I want to see how it holds up on the simple little graphics demos that I wrote in Javascript for a computer graphics class. I wrote them in Javascript because I couldn't get OpenGL to display anything, and the open lab machines didn't have GLUT (which was all that our OpenGL book taught) anyway.


    So I installed FF3.5 and I ran it on those Javascript graphics demos. A point of warning here before anyone mistakenly interprets these numbers to mean anything useful: benchmarks will usually stress one or two parts of a system without giving a comprehensive overview of how the whole system operates in normal practice. These tests in particular don't represent anything that people actually do with Javascript in normal practice. Times are in seconds, lower is better unless you're a psychic monster that feeds off impatience.

    Test                                Firefox 3.0.10   Firefox 3.5   Speedup
    Anti-Aliased Bresenham Line                   2.20          2.00      9.7%
    Bresenham Line, Enterprise Edition           54.01         49.55      9.0%
    Hermite Spline (disconnected)                15.59         14.58      7.0%
    Hermite Spline (connected)                   26.86         26.84      0.1%
    Rotating Square                              61.47         56.99      7.8%
    Circle (part of Sphere)                       2.02          0.96    110.3%
    Sphere                                        3.58          2.43     47.1%
    Triangle with moving lights                  20.51          8.62    137.9%
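
    The Speedup column appears to be the old time divided by the new time, minus one. That formula is inferred from the numbers, not stated in the post; a couple of rows differ from it in the last digit, presumably because the times shown are rounded. A quick Python check on two rows that match exactly:

```python
def speedup_percent(old_seconds, new_seconds):
    """Speedup of the new time over the old, as a percentage (old/new - 1)."""
    return (old_seconds / new_seconds - 1.0) * 100.0


# (Firefox 3.0.10 seconds, Firefox 3.5 seconds), copied from the table.
timings = {
    "Bresenham Line, Enterprise Edition": (54.01, 49.55),
    "Triangle with moving lights": (20.51, 8.62),
}

results = {
    name: round(speedup_percent(old, new), 1)
    for name, (old, new) in timings.items()
}
```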

    Average speedup by graphics type:

    • Image-based particles: 6%
    • Table-based buffers: 76%
    • Animations: 32% (almost all in Triangle)

    The new Firefox is faster than the old Firefox. It's an improvement. Don't look a gift horse in the mouth. In particular, the circle and sphere render much faster (although the aaline doesn't show the same improvement) and the developers fixed most of whatever was slowing down the triangle test.

    Now I'll look a gift horse in the mouth. The results from the new Firefox beat Opera on the hermite3 and bresline tests that Opera bogs down on, and it beat Chrome on the aaline and sphere tests, but otherwise none of its scores beat those of Opera, Chrome, or Safari. (Note: Chrome and Safari were tested on different computers, but FF3.0.x tested similarly on all three machines.) Of the four browsers, Firefox is still the slowest overall in these mostly useless, in no way real-world benchmarks.



    Current Mood: geeky
    Current Music: Butthole Surfers - Pepper
    Tuesday, June 30th, 2009
    8:06 pm
    Removing redundant code

    I waited to investigate one part of the source tree because it wasn't a high priority (as nobody seemed to use it) and it looked more complex than the rest. While most directories have one or two PHP files, this had an index PHP and five separate directories having three to five PHP files in each. When I did finally open up the files to see what they did, this turned out to be one of the simpler parts of the program and one of the most redundant.

    The five directories each correspond to a table in the database. Each directory contains separate 80-100 line PHP scripts for displaying the table, displaying the interface for editing a single record, displaying the interface for adding a new record, running the SQL to edit a single record, and running the SQL to add a new record. All five tables have a very similar schema: there is an ID field, a description, and two tables have a third field.

    There is lots of redundancy between the five directories, as the file that does an operation for table X is very similar to the file that does the same operation for table Y. One could (and I did) use variables to say which table is being operated on and what the names of the fields are. There is also redundancy between the files in each directory in handling CGI variables and displaying HTML. This can be alleviated by handling all the CGI variables in one place and printing all the common HTML in one place.

    The new system uses one front-end file of almost exactly 100 lines and one back-end file of about 90 lines including comments and sanity checks for the new variables. ~1900 LoC --> ~200 LoC and the directory tree goes away. That's a nice result for a few hours of work.
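
    The shape of the idea, as a rough Python/sqlite3 sketch rather than the actual PHP: one description of each table drives the same add, edit, and list code. The table and column names below are invented, and the real back end surely differs in detail.

```python
import sqlite3

# Hypothetical table descriptions: table name -> editable columns.
TABLES = {
    "colors": ["description"],
    "sizes": ["description", "sort_order"],
}


def add_record(conn, table, values):
    """Insert one record; table and column names come only from TABLES."""
    columns = TABLES[table]  # an unknown table raises KeyError (the sanity check)
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    return conn.execute(sql, [values[c] for c in columns]).lastrowid


def edit_record(conn, table, record_id, values):
    """Update every editable column of one record."""
    columns = TABLES[table]
    assignments = ", ".join(f"{c} = ?" for c in columns)
    conn.execute(
        f"UPDATE {table} SET {assignments} WHERE id = ?",
        [values[c] for c in columns] + [record_id],
    )


def list_records(conn, table):
    """Return all records of a table, ID first."""
    columns = ["id"] + TABLES[table]
    return conn.execute(f"SELECT {', '.join(columns)} FROM {table} ORDER BY id").fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE colors (id INTEGER PRIMARY KEY, description TEXT)")
conn.execute("CREATE TABLE sizes (id INTEGER PRIMARY KEY, description TEXT, sort_order INTEGER)")

red_id = add_record(conn, "colors", {"description": "red"})
add_record(conn, "sizes", {"description": "large", "sort_order": 3})
edit_record(conn, "colors", red_id, {"description": "crimson"})
```

    Adding a sixth table would then mean one more entry in TABLES instead of another directory of scripts.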

    I also used one of those do{...}while(false) constructs to easily break out of the backend file without halting the frontend file and while keeping all the variables in the same scope. It works for the purpose and lets the code logic flow in one direction.

    One of the things that brought me to this part of the code is that I'm going through the whole codebase to make all the files share a common source for the HTML header and the opening body and containers. The code was originally written so that each separate PHP file had its own HTML headers. That is being fixed. The whole program likely started as three or so PHP files and grew through copying and pasting as new requirements came in to the few dozens that there are now.



    Current Mood: geeky
    Current Music: Ozzy Osbourne - Mama I'm Coming Home
    Sunday, June 28th, 2009
    11:17 am
    ID4's writers
    The movie Independence Day was on TV last night. I noticed that the dialogue was excellently written. The characters' lines are very expressive of their character and contribute to the plot while being clear, snappy, and often humorous. I wondered who the writers were and what else they have done. Three cheers for IMDB! The writing is credited to the film's director and producer, Roland Emmerich and Dean Devlin. They were also responsible for the Stargate movie and the American Godzilla movie, and Emmerich wrote The Day After Tomorrow (the movie where the Northern Hemisphere gets flash-frozen), but they have not done much else of note.

    Current Mood: blah
    Current Music: Heart - Crazy On You
    Thursday, June 25th, 2009
    10:13 pm
    Variable naming in borrowed code, and other mini-wtfs

    In some places it is obvious that the code I'm working with is copied and pasted from a web forum example. I am in favour of borrowing code from other people and projects so long as their legal rights are respected, and code that someone posts to a web forum ought to be considered public domain. What annoys me is the use of unique and expressive variable names like $flightdateListMonth that mean something to the original program but not this one which, for example, does not have flights.


    One file uses multiple HTML tags with the same @id. The @id is supposed to be unique to one node in the entire document, but the code uses it like a class. This works in Firefox, but it is not linguistically correct.


    Doing it wrong in both PHP and HTML at the same time:

    "<option id='Year' value=\"".$y."\">".$y."</option>"

    It looks like the developer got single and double quotes confused. A single-quoted version would work like this:

    '<option id="Year" value="'.$y.'">'.$y.'</option>'

    The benefit of using double quotes is that variables are interpreted inside them so you do not need to concatenate.

    "<option id=\"Year\" value=\"$y\">$y</option>"


    Current Mood: blah
    Current Music: Moody Blues - Story In Your Eyes
    9:38 am
    You Really Can't Do That On Television

    According to IMDB:

    The 1987 "Adoption" episode was never seen in the U.S. again following its original airing, and it never aired in Canada. Among the scenes that led to the banning is the one where Valerie and Lance adopt Doug because it was cheaper than buying a dog. The studio master of this episode has a large label on it reading "DO NOT AIR". By 1987, there were fifteen episodes pulled from the rotation for Nickelodeon: the banned "Adoption" episode, all thirteen 1981 episodes, and the 1982 "Cosmetics" episode. Including Alasdair's "Crusher Wallace, the school bully" and the censored locker monster skits. While telling jokes, one of the kids is eaten alive in front of the cast.

    According to Wikipedia:

    two rather risque sketches were cut and replaced with less offensive sketches from dress rehearsal:
    • An "Opposite" sketch where a teacher shows his anatomy class a pornographic movie.
    • A sketch where Ross sells his father's issues of Playboy magazine to Ben and Alasdair.


    Current Mood: amused
    Current Music: Monty Python - I Bet They Won't Play This Song On The Radio
    Saturday, June 20th, 2009
    9:39 pm
    The hard sell
    There was a spam in my inbox with the subject line "Open or die in hell".

    Current Mood: amused
    Current Music: Tonic - Queen
    11:28 am
    Notes from work

    Upon opening up a PHP file, I found a very large comment block from the previous maintainer, who recommended adding an $additionalSQL variable to the code and seeing if "the huge if/else blocks can be reduced to a few if(){ $additionalSQL = " and ..." } blocks".

    For each input to the script, the code had if/else to generate different SQL WHERE clauses depending on if the previous inputs were received. In many cases, the only difference in the SQL in each branch was the presence or absence of "and " at the start. Fixing it by the last maintainer's recommendations brought a 22KB file down to 12KB and made it much easier to read.
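
    A closely related way to get the same effect, sketched in Python instead of PHP: collect the conditions in a list and join them with " AND ", so no branch has to know whether it is the first one. This is a variation on the $additionalSQL suggestion, not the code from the site; the table, columns, and function name are made up.

```python
def build_query(user=None, start_date=None, end_date=None):
    """Build a SELECT with only the WHERE conditions whose inputs were supplied."""
    conditions = []
    params = []
    if user:
        conditions.append("user_name = ?")
        params.append(user)
    if start_date:
        conditions.append("start_date >= ?")
        params.append(start_date)
    if end_date:
        conditions.append("end_date <= ?")
        params.append(end_date)

    sql = "SELECT * FROM bookings"  # hypothetical table
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


no_filter = build_query()
two_filters = build_query(user="tang", end_date="2009-06-30")
```

    Each input contributes one small if block, and the "and" bookkeeping happens in exactly one place.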


    The same source file had a 4-deep nested SQL query which ground my workstation to a halt for two minutes while it ran. The original programmer was familiar with WHERE foo IN (select ...) but not JOIN. In other words, the SQL developer knew set theory but not SQL. I made it a 2-deep query of half the size and it runs nicely now.
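
    The original query is not shown, so this is only a toy illustration of the rewrite, in Python with sqlite3 and an invented two-table schema: the same question asked once with IN (select ...) and once with JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bookings (id INTEGER PRIMARY KEY, user_id INTEGER, room TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO bookings VALUES (1, 1, 'lab'), (2, 3, 'lab'), (3, 2, 'hall');
""")

# Set-theory style: a subquery nested inside IN (...).
nested = conn.execute("""
    SELECT name FROM users
    WHERE id IN (SELECT user_id FROM bookings WHERE room = 'lab')
    ORDER BY name
""").fetchall()

# Join style: one flat query. DISTINCT guards against a user with several
# matching bookings showing up more than once.
joined = conn.execute("""
    SELECT DISTINCT users.name FROM users
    JOIN bookings ON bookings.user_id = users.id
    WHERE bookings.room = 'lab'
    ORDER BY users.name
""").fetchall()
```

    Whether the join is actually faster depends on the database and its version, so it is worth timing both forms on the real data.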


    The same source file had a strange method to tell if any of several tests ran. There is a value called $nothing_happened which is incremented if a test is not run, and then this value is compared to the total number of tests in the code (manually counted) to make sure they are different. I made it a boolean that starts out true and is set to false when any test does run. Instead of checking $nothing_happened against a number that needs maintenance if the code changes, the code now works so that $nothing_happened is true if nothing happened. That's more sensible, eh?
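
    In sketch form (Python, with a hypothetical list of enabled/check pairs standing in for the real tests), the flag version looks like this:

```python
def run_checks(checks):
    """Run every enabled check; return True only if none of them ran."""
    nothing_happened = True
    for enabled, check in checks:
        if enabled:
            check()
            nothing_happened = False  # no manual count of tests to keep in sync
    return nothing_happened


ran = []
some_ran = run_checks([
    (False, lambda: ran.append("a")),
    (True, lambda: ran.append("b")),
])
none_ran = run_checks([(False, lambda: ran.append("c"))])
```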


    To produce several CSV field headers at once, the code does this:

    fwrite($fp, "User Name".',');
    fwrite($fp, "Start Date".',');
    fwrite($fp, "End Date"."\n");

    That can all be replaced with one fwrite() call and one string.
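
    The same idea in a Python sketch: build the header line from a list and write it once. The csv module version is an extra that the PHP code does not do; it also quotes any field that happens to contain a comma.

```python
import csv
import io

headers = ["User Name", "Start Date", "End Date"]

# One write with one joined string, mirroring a single fwrite() call.
buffer = io.StringIO()
buffer.write(",".join(headers) + "\n")

# The csv module produces the same line here and handles quoting if needed.
quoted = io.StringIO()
csv.writer(quoted, lineterminator="\n").writerow(headers)
```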


    Strange programming practice: throughout the program, integers in SQL code are treated as 'strings'. If I am not mistaken, this behaviour breaks every database except for MySQL because trying to insert a string value into an integer container produces a data type mismatch. The code also puts field and table names in `backquotes`, which is a MySQL-only syntax. At first I thought the original programmers must know better than me, now I'm pretty sure they're doing it wrong and/or just copying stuff from phpMyAdmin which puts backquotes on everything.


    The program includes a large Javascript source file. Part of the largeness is repeated code. There was one original 50-line function that was copied and tweaked a little bit for a slightly different purpose. Twice. By giving the original function two extra variables, a parameter to branch on, and 10 lines of branching code, the two newer functions could be reduced to one-liners that just call the first function.

    There was also some repeated code to toggle the visibility of features on the page. Each feature has its own fooIsVisible variable, its own data variable, and its own toggle function. Object-orientation to the rescue! I made a class containing all of the above and made instances of the class for each feature. The global toggle functions are now one-liners and I might be able to get rid of them entirely if the instances are in the scope of the code that uses these functions.
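
    Roughly what that class looks like, sketched in Python rather than the site's Javascript; the class, feature, and function names are invented for the example.

```python
class Feature:
    """One toggleable page feature: its visibility flag and its data together."""

    def __init__(self, name, data=None, visible=False):
        self.name = name
        self.data = data
        self.visible = visible

    def toggle(self):
        """Flip visibility and return the new state."""
        self.visible = not self.visible
        return self.visible


legend = Feature("legend", data=["a", "b"])
sidebar = Feature("sidebar", visible=True)


def toggle_legend():
    # The old global toggle functions shrink to one-line wrappers like this.
    return legend.toggle()
```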

    All in all, I cut the Javascript down by 150 lines. I love being able to measure my productivity in negative lines of code.

    Current Mood: geeky
    Current Music: Weird Al - Couch Potato

    Thursday, June 18th, 2009
    7:28 pm
    WaPost fires Froomkin
    The Washington Post fired blogger Dan Froomkin, one of their most popular writers. I liked reading him.

    Current Mood: blah
    Current Music: Green Day - Uptight
    11:11 am
    Wednesday, June 17th, 2009
    11:02 pm
    Civil unrest in Iran
    Iran is in a situation that could possibly lead to another revolution. Mahmoud Ahmadinejad won re-election in the first round with an improbable 63%-34% victory over the nearest candidate Mir Mousavi, with other candidates gaining negligible amounts. No one expected it to be that big a blowout, protests broke out, and Mousavi endorsed further protests.

    Current Mood: tired
    Current Music: The Beatles - The Ballad of John and Yoko
    Tuesday, June 16th, 2009
    8:46 pm
    Opera's latest web browser is also a web server

    This is interesting...

    Rather than compete with the cloud-based services that are currently so popular, Opera is proposing, and enabling, a return to how the internet used to work: everyone runs their own host device, with their own applications running on their own hardware, which can then be accessed from anywhere using any web browser.

    Immediate problems that I see: browser incompatibility (duh), keeping the served data available online somewhere after you turn your system off, and the case of wanting to run the browser or the server but not both. At least they have name resolution figured out:

    Routing is handled by servers at Opera, and the computer on your desk is addressed as "unite://computername.username.operaunite.com".


    Current Mood: calm
    Current Music: Portishead - Glory Box
    Wednesday, June 10th, 2009
    8:51 pm
    The Facebook Panopticon

    I have a seekrit Facebook account to keep in touch with a few school buddies who only keep in touch with each other over Facebook. Being my typically paranoid self, I have as little of my personal information on there as possible: fake birthdate, fake name, and so on. Today, Facebook suggested that I add as a friend an ex-coworker from the place I worked at two years ago. I have no Facebook connection to the workplace, just to the school buddies. None of the school buddies worked at that place. The ex-coworker got out of school and married and settled years ago.

    This leads to the question: How the hell did they do that? Also, which departments of which governments and corporations have access to this technology? And how can I get my hands on this power to abuse it for my own selfish interests?

    I assume that part of the formula is based on my IP-derived geographic location, since that's about the only thing I can think of that would match and be in the Facebook database. I doubt that it's matching on techie keywords from wall talk because I've been posting twitter shit. Facebook does have one real e-mail address of mine that they could probably match to a third-party data mine that might have my whole life history in it, or they could use it to search the web and pull down information tying me to the job and the ex-coworker.

    Current Mood: uncomfortable
    Current Music: Europe - The Final Countdown

    Tuesday, June 9th, 2009
    8:01 pm
    w3tardedness: Form method attributes must be lower case

    From HTML 3.2:

    method
    When the action attribute specifies an HTTP server, the method attribute determines which HTTP method will be used to send the form's contents to the server. It can be either GET or POST, and defaults to GET.

    From HTTP 1.1:

    5.1.1 Method

    The Method token indicates the method to be performed on the resource identified by the Request-URI. The method is case-sensitive.

    Method         = "OPTIONS"                ; Section 9.2
                   | "GET"                    ; Section 9.3
                   | "HEAD"                   ; Section 9.4
                   | "POST"                   ; Section 9.5
                   | "PUT"                    ; Section 9.6
                   | "DELETE"                 ; Section 9.7

    From the w3c Validator:

    value of attribute "method" cannot be "POST"; must be one of "get", "post"

    What seems to have happened is that during the switchover to XHTML back around 2000, the decision was made to make everything lower case including options that were supposed to be upper-case parameters for another case-sensitive protocol. The form method attribute is just a layer of indirection now; "get" translates to "GET" but you aren't supposed to know that, you are just supposed to use "get". In the w3c's defense, it would have been odd to have the set of values for this one attribute remain upper-case while the rest of the language went to lower case. I still think the decision to make HTML case-sensitive was ridiculous.

    Current Mood: blah
    Current Music: The Mighty Mighty Bosstones - Nevermind Me

    7:43 pm
    More minor code WTFs

    I code my php the old-fashioned way where the include goes at the top of the file and the globals that are to be imported into a function's scope have to be redeclared in that function.

    include('backend');
    function func(){
      global $g_foo;
      // ... rest of function
    }
    function func2(){
      global $g_bar;
      // ... rest of function
    }

    The code I'm working with uses a looser style that re-includes the needed backend files at the beginning of each function, which brings all of their globals into the current scope without having to declare them individually.

    function func(){
      include('backend');
      // ... rest of function
    }
    function func2(){
      include('backend');
      // ... rest of function
    }

    Disaster occurs when the two styles are mixed. When a page with a global include runs a function that repeats the include, the repeated include can break things. If the include in the function uses require_once() or include_once() and the file had been included globally by earlier code, the file is not included and the function does not get the globals it needs; but if anything other than *_once() is used when the file had been included before, any global code in the file will be run twice.

    There is a third include style both of us use which assumes the files containing certain functions to have been loaded by earlier code, so the include is left out entirely.


    Another wtf: lots of single quotes in output HTML.

    if(condition){
      echo "
      <li><a href='dest.php' class='foo' bar='baz'>link</a></li>
      ... etc ...
      ";
    } else if(condition2){
      echo "
      <li><a href='dest.php' class='foo' bar='baz'>link</a></li>
      ... etc ...
      ";
    }

    This avoids having to \escape every quotation mark. Strictly speaking, single-quoted attribute values are valid HTML and every common browser accepts them, but the mixed quoting made my Graphviz script miss the links when I was trying to generate a logic flow of the website. I replaced all the HTML chunks with <<<HEREDOCs using double-quoted attributes.

    Then I noticed it was the same block of HTML code being repeated after each condition, sometimes with very minor changes that could be handled with variables. I'll have to look into simplifying that later.

    Current Mood: geeky
    Current Music: The Mighty Mighty Bosstones - 128

    Wednesday, June 3rd, 2009
    9:50 pm
    A small WTF

    Here is an interesting construct:

    do { // once, with early exit possible
            if(isAdmin()){
                    $state = 4;
                    break;
            }
            if($state == 0){
                    $state = 1;
                    break;
            }
            if($err){
                    $state = 3;
                    break;
            }
            $state = 2;
    } while(FALSE); // do once
    

    It's like they forgot that the else statement existed.

    The construct found a use later in the code where the branches were based on the results of SQL queries and other things that needed a few lines of code to prepare them.

    do {
      $query1 = prepare_stuff();
      if(mysql_query($query1)){
        break;
      }
      $query2 = prepare_stuff();
      if(mysql_query($query2)){
        break;
      }
      // etc.
    } while(FALSE);
    

    The preparation code is kept closest to where it is used, making the code easier to read in this case. If if-else had been used, all of the preparation code would have been at the top in one big chunk.
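
    For comparison, a Python sketch of both uses. Python has no do/while, so a function with early returns plays the role of the breakable block; the function and parameter names are made up.

```python
def next_state(is_admin, state, err):
    """Same branching as the first do { ... } while(FALSE) block, using early returns."""
    if is_admin:
        return 4
    if state == 0:
        return 1
    if err:
        return 3
    return 2


def first_successful(steps):
    """Prepare and try each step in order; stop at the first one that succeeds."""
    for prepare, attempt in steps:
        query = prepare()  # preparation stays right next to where it is used
        if attempt(query):
            return query
    return None


chosen = first_successful([
    (lambda: "query1", lambda q: False),  # first attempt fails
    (lambda: "query2", lambda q: True),   # second succeeds, so the loop stops here
    (lambda: "query3", lambda q: True),   # never prepared or run
])
```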

    Current Mood: tired
    Current Music: Queensryche - Walk in the Shadows

    7:21 pm
    It's conservative head explodey time

    Via Olbermann: An armed shopkeeper foiled a robbery! Score one for self defense. Then he converted the robber to Islam. Wait, what? Righty heads go bewm!


    And this one will confuse the conservapedia set: chimpanzees use five distinct wooden tools to collect honey from beehives.

    Current Mood: blah
    Current Music: Natalie Merchant - Wonder
