Wednesday, June 29th, 2005


For the last couple of days I have been keeping an eye on the domain name “”. It has been owned by a cybersquatter who would have wanted a pretty penny for it. I noticed that the registration was due to expire on 28 June 2005 and marked it on my calendar. So, starting just after midnight in the timezone of the registrar, I have been trying to register it myself. I kept it up without success through the 29th and now it’s finally the 30th. Each time I tried, I was told that the name was taken and active, even though the whois databases indicated that the record expired on the 28th.

The registrar just updated its database to reflect that the record has been extended another year and that the name is still in the possession of the cybersquatter. Rats! I guess he had some automatic renewal system set up and I never had a chance, which is what I should have expected. The only thing that was stringing me along was that stale date in the whois record.

Why would I want the domain name? Well, one of my online handles, and the one I favor now, is altjira, an Australian aboriginal god who created the world and then just sort of left everyone to their own devices. Shows a little hubris, eh? I especially like the non-interference policy on the part of a creator deity. I usually leave it uncapitalized, which is both in keeping with Linux styling and a protest against the use of capitalization for deities even when referred to by pronouns, e.g. “Him”.

I use it as my username wherever I can. I was especially happy to get it as my slashdot id and as a gmail account (which is declining in use since I realized how Big Brother gmail is, and besides, I now have email with my website.) I just thought that “” would be an ultra-cute email to have. For those in the know, it would be saying “you’re getting email from god in heaven.”

Posted by Greg as My Website, Posts About Me at 16:53 PST


Google Fame New Start

UPDATE: Although it might be interesting from a technical point of view, I had to abandon the programming effort I describe herein because I found out that it violates the Google Terms of Service.

Well, as I discussed earlier, I have explored using php to generate Google searches and parsing the resultant html files to find the information I want. Parsing refers to taking a big chunk of data and analyzing it to extract needed information. In the case of my Google Fame plugin, I want to send search requests to Google and search through the results to find any links that go to my website. I also want to note Google’s estimate of how many results there are, and how far down the list the first reference to my website is.

Normally you would use a browser to connect to Google. Once connected, you can enter your search terms, maybe adjust a few parameters (like requesting only pages that are in English, for example) and search. The Google homepage is actually a form that you enter data into first, or you might use the Google Toolbar in Internet Explorer or the Googlebar extension in Mozilla or Firefox to pre-enter the form information and skip the Google front page. Either way, you’re sending a formatted request to the Google site, and Google analyzes that request, finds what you want, and sends the information back to you formatted as html code that your browser converts into a webpage.

I’m writing code that skips all the interactive, user-visible steps. I’m going to set up my own interface to determine what the search terms and parameters should be and convert them into a call to Google. For example, if I want to search using the terms “greg perry” and “san diego”, and I want 100 results each time, I could send the string

to Google and it would send me the results back. There are actually many more variables I can put into the request I send to Google, but I’ll keep it simple here.
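To make the idea concrete, here is a minimal sketch in Python (not the php from the post) of turning search terms into a request string. It is not the string from the original post. The base URL and the parameter names `q`, `num`, and `start` are assumptions for illustration, and the function name `build_search_url` is hypothetical. The code only builds the string and never contacts Google.

```python
from urllib.parse import urlencode


def build_search_url(terms, results_per_page=100, start=0):
    """Build a search URL from a list of terms (hypothetical helper)."""
    # Wrap multi-word terms in double quotes so they act as exact phrases.
    query = " ".join(f'"{t}"' if " " in t else t for t in terms)
    # Parameter names below are assumptions, not taken from the post.
    params = {"q": query, "num": results_per_page}
    if start:
        params["start"] = start  # offset of the first result wanted
    return "http://www.google.com/search?" + urlencode(params)


print(build_search_url(["greg perry", "san diego"]))
# http://www.google.com/search?q=%22greg+perry%22+%22san+diego%22&num=100
```

The `start` argument is there so that the same helper can ask for the next batch of 100 later on.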

But I don’t want to look at the results, I want my program to look at them for me and find what I want. So instead of letting my browser display the results, I capture the information that Google returns and put it into a string. It’s a long string – about 100,000 characters, or 100 KB – but nowadays that’s not a problem. Then I use other commands to search through the long string to find the information I want. At this point in time, Google will only give me up to 100 results per search, so if my website isn’t in the first batch, I have to generate another call asking for the next 100, and so on, until I find my site, or reach the end of the list and determine that my site didn’t make it.

The commands in php use a pattern-matching technique called “regular expressions”, or regex, which was popularized by the programming language Perl (the php functions follow Perl’s syntax). I found regex to be pretty difficult to understand at first, but using tutorials and samples I found on the web, I was able to cobble together some workable code. It may not be elegant, and it might give errors if unexpected results are encountered, but it’s enough for now.
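As a rough illustration of that kind of parsing, the sketch below uses Python's `re` module on a made-up chunk of html. The sample markup, the phrase "of about", and both patterns are assumptions invented for this example. A real results page is laid out differently and would need different patterns.

```python
import re

# Invented sample markup standing in for a captured results page.
SAMPLE_HTML = """
<p>Results 1 - 3 of about 12,400 for greg perry</p>
<a href="http://example.com/one">One</a>
<a href="http://www.example.org/blog/">Two</a>
<a href="http://example.net/three">Three</a>
"""

# Capture the target of every link in the page.
LINK_RE = re.compile(r'<a\s+[^>]*href="([^"]+)"', re.IGNORECASE)
# Capture the estimated result count, commas included.
COUNT_RE = re.compile(r"of about ([\d,]+)")


def extract_links(html):
    """Return every href value found, in page order."""
    return LINK_RE.findall(html)


def extract_estimate(html):
    """Return the estimated result count as an int, or None if absent."""
    match = COUNT_RE.search(html)
    return int(match.group(1).replace(",", "")) if match else None


print(extract_links(SAMPLE_HTML))
print(extract_estimate(SAMPLE_HTML))  # 12400
```

Returning `None` when the count is missing is one way to cope with the "unexpected results" case instead of raising an error.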

Now, the Google API is designed to allow programmers to do this sort of thing without parsing the html files. You make program calls to the API and get preformatted results back from Google. The trouble that I found is that the results are wrong – sometimes seriously different from what you get if you go to Google in your browser and send in the same information. You also have to register with Google when you download the API and get a key that you have to include with your API calls – the key allows Google to associate those API calls with you, and the number of calls you can make in a day is limited. So for me, the API isn’t worth the programming steps that it saves.

What’s an API, you might ask? Here’s a good definition from Arizona State University:

Short for Application Program Interface, API is a set of routines, protocols, and tools for building software applications. A good API makes it easier to develop a program by providing all the building blocks. A programmer puts the blocks together. Most operating environments, such as MS-Windows, provide an API so that programmers can write applications consistent with the operating environment. Although APIs are designed for programmers, they are ultimately good for users because they guarantee that all programs using a common API will have similar interfaces. This makes it easier for users to learn new programs.

So far, I’ve created a program that uses a simple form to set up the Google search and go through the results. The program is set up to spit a lot of info back, including the contents of various variables I use so I can check that my code is functioning properly, and the complete listing of all the websites that Google found that match my search request. My program counts and numbers the matches and sets a flag when my site is found in the results. It stops sending calls to Google when it finds a reference to my site, or if I hit the limit of how many results Google is willing to provide. (While playing around with this, I found that Google doesn’t seem to give you more than 1000 results, even if it tells you that there are a lot more.) Then it tells me my website ranked X out of XX results (zero if I didn’t appear at all) which is all I’m really looking to know. You can play with my program so far if you want.
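The counting and stopping logic described above could look roughly like this in Python. This is a sketch under stated assumptions, not the program from the post: `find_rank` and `fake_fetch` are hypothetical names, and the page fetcher is a stand-in that returns canned links, so nothing is sent to Google.

```python
from typing import Callable, List, Tuple

PAGE_SIZE = 100     # results requested per call, as in the post
MAX_RESULTS = 1000  # the cap the post observed


def find_rank(fetch_page: Callable[[int], List[str]], my_site: str) -> Tuple[int, int]:
    """Return (rank, results_examined); rank is 0 if the site never appears."""
    examined = 0
    for start in range(0, MAX_RESULTS, PAGE_SIZE):
        links = fetch_page(start)
        if not links:  # the list ran out before the cap
            break
        for link in links:
            examined += 1
            if my_site in link:  # the "found" flag: stop right away
                return examined, examined
    return 0, examined


def fake_fetch(start: int) -> List[str]:
    """Canned data: 250 links with the target site at position 138."""
    all_links = [f"http://example.com/page{i}" for i in range(250)]
    all_links[137] = "http://www.mysite.example/post"
    return all_links[start:start + PAGE_SIZE]


rank, examined = find_rank(fake_fetch, "mysite.example")
print(rank)  # 138
```

Passing the fetcher in as a function keeps the ranking logic testable on its own, whatever ends up supplying the links.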

So now I’ve pretty much caught up to where I was when I started getting disgruntled with the Google API. Next, I want to expand the code I’ve written to interact with my online database so that it retrieves preset search terms from one place, gets the results, and stores them in another place in the database. Then I have to create the interfaces. I need two – one in my WordPress Administrator area (also called the “backend”) that allows me to put the search terms into the database; and one in my blog (the “frontend”) – I point again to the space I marked out in my right sidebar – so that people can see the results. Somewhere I have to have a way of telling my program when to run. My choices are to have it run once a day all by itself; to launch it from my backend; or to make my frontend interface figure out when it’s me looking at it, and offer me a way to run it while I’m there, without bothering to go into the backend.

The next step after that is to package up the various programs as a plugin and make them available to other WordPress users so that they can use it, too. This might be the hardest part – I also have to write programs to install the plugin and set up the required elements (such as the database tables I use) that I just manually set up on my own website. I also need to register the project with WordPress and create an area in my website to make the download available to the public. This area will probably have to include documentation of the plugin and a forum for users. I’ll have to offer technical support for people who have trouble getting it to install and work properly, and respond to feature requests from people that want it to do something else or do it differently. How much work that will require depends on how well I program it in the first place, how popular it gets, and how close I am to anticipating what other people want.

It could end up being a lot of work. So why bother? Here’s what I anticipate getting out of it:

I just spent a significant chunk of time documenting my efforts that I could have used for programming. Well, that’s ok, too. I enjoy writing as well.

Posted by Greg as My Website, Programming at 12:41 PST
