Look at your HTML page as XML data for the sake of SEO

I’ve been working on a script that goes to a URL and scrapes parts of its data, which is pretty much a crawler or spider.

If every page the crawler landed on were valid, my job would be easy. In reality, many pages are not valid, and the script has to fall back on regular expressions.

For site owners this can be a good or a bad thing: invalid markup makes a page harder to scrape, but it also makes it harder for search engines to read.

However, exposure matters a lot for marketing a site, and a valid HTML page has a better chance of being picked up by search engines such as google.com, because it hands the search engine crawler what it wants more efficiently.

I believe the engineers who work on those crawlers have already overcome many difficulties caused by invalid markup. Still, if the HTML in a page is not valid (treating it as XML), those smart engineers have to come up with logic to work around it, perhaps with regular expressions. That is prone to mistakes and can lead to only a little of the data being scraped from an invalid page. After all, engineers are human, and humans make mistakes.
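To make the difference concrete, here is a minimal sketch in Python of what a scraper might do. It is only an illustration, not how any real search engine crawler works. The function name extract_title and the two sample pages are made up for this example, and the pages have no XML namespace (real XHTML usually declares one, which would change the lookup).

```python
import re
import xml.etree.ElementTree as ET


def extract_title(markup):
    """Return (title, method) where method is "xml" or "regex".

    Hypothetical helper: try to read the page as XML first and fall back
    to a regular expression only when the markup is not well formed.
    """
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        # Invalid markup: guess with a pattern, which is the fragile path.
        match = re.search(r"<title[^>]*>(.*?)</title>", markup,
                          re.IGNORECASE | re.DOTALL)
        title = match.group(1).strip() if match else None
        return (title, "regex")
    # Well-formed markup: the parser gives us a tree we can query directly.
    node = root.find(".//title")
    title = node.text.strip() if node is not None and node.text else None
    return (title, "xml")


# Made-up sample pages; the second one has an unclosed <br> tag.
valid_page = "<html><head><title>Hello</title></head><body><p>Hi</p></body></html>"
invalid_page = "<html><head><title>Hello</title></head><body><p>Hi<br></body></html>"

print(extract_title(valid_page))
print(extract_title(invalid_page))
```

With the valid page the scraper can ask the tree for any element it likes. With the invalid page it needs a hand-written pattern for every piece of data, and each pattern is one more place to make a mistake.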

For the same reason, a page built with well-formed, semantic HTML has a better chance of showing up for a keyword typed by users.

That’s just my idea of how an HTML page should be built with SEO and future use in mind.

So my recommendation is this:

1. Treat the markup in a page as data. Forget about presentation and such. Just make sure the data is valid.
2. Use CSS to visualize the data (= HTML markup) so it appeals to users.

It’s quite simple after all.

Using .plist to get app-wide variables

In web application architecture, there is usually a file that holds system configuration info.
In Objective-C development, there is a file called {name}.plist that holds some information about your application.

This is how you can get a specific value out of the file.


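// The main bundle gives access to the application's {name}.plist.
NSBundle *mainBundle = [NSBundle mainBundle];
// "myVariable" is a custom key that must exist in the plist; the result is nil otherwise.
NSString *myValue = [mainBundle objectForInfoDictionaryKey:@"myVariable"];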

There you have it!

Battlestar Galactica: Blood & Chrome

I watched Battlestar Galactica: Blood & Chrome last night and felt it was too short. One thing I liked about the Battlestar Galactica series was that the whole storyline leaned on philosophical views (and, of course, battle scenes). This short series did not have that (’cause it was too short).

In the end I enjoyed it because I am a huge fan of Battlestar Galactica. (I even liked Caprica a lot.)

Here’s the link on Amazon:
http://www.amazon.com/gp/product/B00BHNP3SI

PHP PSR-[0-3]

At my workplace, we use PHP and CodeSniffer hooked up to Jenkins.
I like the fact that PSR-1 and PSR-2 have been approved by the committee.
The PSR-0 standard is found at:
https://github.com/php-fig/fig-standards/blob/master/accepted/PSR-0.md

The PSR-1 standard is found at:
https://github.com/php-fig/fig-standards/blob/master/accepted/PSR-1-basic-coding-standard.md

The PSR-2 standard is found at:
https://github.com/php-fig/fig-standards/blob/master/accepted/PSR-2-coding-style-guide.md

The PSR-3 standard is found at:
https://github.com/php-fig/fig-standards/blob/master/accepted/PSR-3-logger-interface.md

P.S. Kohana 3.3 follows PSR-0, and the migration from 3.2 to 3.3 is quite tough for me. It requires a lot of refactoring on my side…

safari emoji list

http://www.grumdrig.com/emoji-list/

This works on Safari only. Sorry, Chrome and other browsers.

phantomjs on centos 6.2

Right now I have Jenkins doing automated builds and static code analysis for PHP projects. However, I need to look into automated unit test suites for client-side JavaScript and a Selenium server for client-side functional tests.

This post will be a long log of what I am going through (including the steps that failed…)

1. I downloaded the phantomjs-1.7.0-linux-x86_64.tar.bz2 binary from this page.

2. When I executed it, I got this error:
phantomjs: error while loading shared libraries: libfreetype.so.6: cannot open shared object file: No such file or directory

3. So I installed libfreetype.so.6:
sudo yum install libfreetype.so.6

4. I still got the same phantomjs error, which says libfreetype.so.6 is not found.

5. It looks like libfreetype.so.6 was installed in /usr/lib, which holds 32-bit libraries, while the 64-bit phantomjs binary needs the 64-bit version.

6. So I created symlinks and placed them in /usr/lib64/

7. I got a different error message (progress made!):
phantomjs: error while loading shared libraries: libfreetype.so.6: wrong ELF class: ELFCLASS32

8. It turned out that I just needed to install freetype:
yum install freetype

9. Now another dependency issue:
libfontconfig.so.1: cannot open shared object file: No such file or directory

10. This installed the correct lib files:
yum install fontconfig

11. Got it installed. :)

[xxxxx@localhost ~]$ phantomjs --version
1.7.0
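Looking back, I could have found all the missing libraries at once instead of one error at a time. Here is a rough Python sketch of that idea, assuming ldd is available on the machine. The function names and the sample ldd output are made up for illustration (the sample only mimics the two errors from the steps above), and I did not run this as part of the install.

```python
import subprocess


def parse_missing_libs(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # The library name is everything before the "=>" arrow.
            missing.append(line.split("=>")[0].strip())
    return missing


def missing_libs_for(binary_path):
    """Run ldd on a binary and list its unresolved shared libraries."""
    result = subprocess.run(["ldd", binary_path],
                            capture_output=True, text=True, check=False)
    return parse_missing_libs(result.stdout)


# Illustrative sample only, not captured from my machine.
sample_output = (
    "\tlibfreetype.so.6 => not found\n"
    "\tlibfontconfig.so.1 => not found\n"
    "\tlibc.so.6 => /lib64/libc.so.6 (0x0000003a1c000000)\n"
)

print(parse_missing_libs(sample_output))
```

On the real machine the call would be something like missing_libs_for("phantomjs") with the path to the downloaded binary, and each name in the result points at a package to install with yum.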

UPDATE Dec 6, 2012
Found this blog post on PhantomJS and QUnit.

Also some very useful answers on Stack Overflow regarding PhantomJS, Ant, and Jenkins.


a good read on “javascript”

http://omar.gy/how-i-ended-up-enjoying-javascript/

facebook becomes another press mockery?

Remember when Yahoo got ridiculed in the press over its low stock price, and people said Yahoo needed great products?

It seems like Facebook is the next one getting ridiculed, and people say Facebook needs a big product that can get its stock back above the IPO price.

Funny how things are after IPO.

My hackintosh went grey after sleepenabler

So I basically had to reinstall the whole OS after enabling the sleep feature on my hackintosh.

I should have done more investigation first… oh well….

Yahoo news’ comments are fun to read

I see a resemblance to reddit in Yahoo News’ comments.
They get a great number of comments from so many users, and the comments read a lot like reddit’s.
It’s just my personal observation… that’s all.