12 Mar

How to create an SVG data visualisation with PHP

By Luis Freitas

A great article by Brian Suda in NETMAGAZINE that I liked so much that I have transcribed the most interesting parts here:

Using code to create beautiful visualizations saves time, effort and allows you to focus on the idea rather than implementation details. Brian Suda explains how he wrote a PHP script to build an SVG graphic based on the .net magazine covers.

A good programmer is a lazy programmer. In a growing world of data visualizations, handcrafting each design won’t scale. I wanted to see if there was a way to create aspects of visualizations using a program, to make the output easy to create and reusable. Much like the UNIX philosophy of loosely joined programs, was it possible to build up a small toolbox of loosely joined scripts that create building blocks of visualizations?

There doesn’t seem to be an accepted term for this. I have heard the phrases “deterministic design”, “programmatic design” and “computational design”. What they are all aiming for is the ability to consistently create some graphic element based on data input.

We see this in the real world all the time. The old mercury thermometers took data from the environment, the temperature, and converted that into a visualisation by moving the mercury up the pipette. What is the digital equivalent?

There are a few candidate technologies. If you’re working online, then canvas springs to mind. It allows you to draw raster graphics quickly and easily. If you want, there are also plenty of image code libraries that can generate GIFs, JPEGs and PNGs on request. But what if your target isn’t always online? What if you’re aiming for print? Then you could use a raster graphic, but it would need to be pretty large. A better solution is to create a vector-based image format from your code. This is where SVG (Scalable Vector Graphics) steps in.

    Knowledge needed: Intermediate PHP, basic SVG/XML knowledge
    Requires: Text editor, PHP, SVG viewer
    Project time: 2-3 hours
    Download source files

What is SVG?

SVG is the little technology that could! It is over 10 years old and the specification is still being refined. SVG was designed and redesigned with an XML web in mind. SVG Tiny was set to take the mobile world by storm, but it never did. The web would have zoomable graphics that could scale with your responsive web design, but few browsers natively support SVG. Looking around, you’d think that SVG failed and is a write-off. The web’s loss is a programmer’s gain!

SVG is an XML-based language used to describe vector graphics. There are a handful of primitives that you need to know, like line, circle, rectangle and path. From these you can build up much more complex images. Since XML is just text, you can write an SVG file in any text editor. Even simple programming scripts can quickly output SVG. Since the format is text, it’s possible to get in and tweak it even after your code is done… If you don’t like the colour, then open up Notepad and find and replace it.
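
As a minimal sketch of how little is needed, the following PHP builds a small SVG document from the four primitives named above (the rectangle element is spelled rect in SVG). The function name buildSampleSvg() and all coordinates, sizes and colours are assumptions chosen for illustration; they do not come from the original article.

```php
<?php
// Hypothetical helper: returns a tiny SVG document as a string.
// Coordinates, sizes and colours are arbitrary illustration values.
function buildSampleSvg() {
    $svg  = '<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200">' . "\n";
    // A straight line between two points
    $svg .= '  <line x1="10" y1="10" x2="190" y2="10" stroke="#333333" />' . "\n";
    // A circle: centre point plus radius
    $svg .= '  <circle cx="50" cy="60" r="30" fill="#cc0000" />' . "\n";
    // A rectangle: top-left corner plus width and height
    $svg .= '  <rect x="100" y="30" width="80" height="60" fill="#0066cc" />' . "\n";
    // A path: move to a point, then draw a quadratic curve
    $svg .= '  <path d="M 10 180 Q 100 100 190 180" stroke="#009933" fill="none" />' . "\n";
    $svg .= '</svg>';
    return $svg;
}

echo buildSampleSvg();
```

Saving the output with an .svg extension and opening it in a text editor shows exactly the same markup, which is the "View Source" quality described above.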

SVG is a beautifully simple thing to learn. For many years I never got into programming graphics because you needed to know about image formats and it was like a foreign language, but with SVG it’s just text. The web won because you could “View Source”; SVG is a great image format because you can do the same!

Using code to build SVG

Building up SVG from code has its advantages. You can mathematically produce exacting results each time. Using algorithms you can quickly space out objects at exact intervals. I’ve seen designers try to snap objects to lines or use the rulers to measure out distances, only to zoom way in, check it, zoom way out and still have it off by sub-millimetre tolerances. In code, this can all be ignored. Incredibly complex curves can be explained away with a few lines in a path and best of all, it is reproducible over and over again. Generating 10,000 random dots in a script is one for loop. By hand, this would take lots of copying and pasting and then it wouldn’t be truly random. The power of code to quickly generate designs that are either too exacting or too tedious for designers is the sweet spot.
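
The "10,000 random dots in one for loop" claim can be sketched as follows. This is only an illustration: randomDotsSvg() is a hypothetical helper, the canvas size, dot radius and colour are assumed values, and mt_rand() produces pseudo-random rather than truly random positions.

```php
<?php
// Hypothetical helper: scatter $count dots over a $width x $height canvas.
function randomDotsSvg($count, $width, $height) {
    $svg = '<svg xmlns="http://www.w3.org/2000/svg" width="' . $width
         . '" height="' . $height . '">' . "\n";
    for ($i = 0; $i < $count; $i++) {
        // Pick a pseudo-random centre point inside the canvas
        $cx = mt_rand(0, $width);
        $cy = mt_rand(0, $height);
        $svg .= '<circle cx="' . $cx . '" cy="' . $cy . '" r="2" fill="#000000" />' . "\n";
    }
    return $svg . '</svg>';
}

// Build the markup; echo $dots to send it to a browser or a file
$dots = randomDotsSvg(10000, 800, 600);
```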

I’m not advocating taking away the design of any graphics from the professionals, but rather letting code do what it does best. A good programmer is a lazy programmer. A good designer should be lazy. There is no need to create 10,000 random dots by hand. Your time could be used in much better ways.

Building up SVG via code is the quickest way to get a base that can easily be imported into more complex vector software such as Adobe Illustrator or Inkscape. From there, the design can be shaped further to match the needs of each unique project.

Examples

If we take an example, things will be much clearer. I see great designs all around and wonder how they did it and how, or if, it could be done in code. One of my favourites is this poster for WIRED magazine’s 15th anniversary. In its own right it is a beautiful thing to look at, but until you understand it, you don’t get the subtle reference. Each of the colour wheels represents the major colours of each issue’s cover. You can see over time how they went through dark periods and bright coloured periods. I thought to myself how I might go about doing something like this and wrote a simple PHP script that would take a .net magazine cover and produce a similar effect.

Making the .net covers visualizations

For those of you not familiar with PHP, SVG or how to view them, it is pretty easy. I’ll walk you through the code and show you how to view SVG in the browser and in a text editor. If you want to use your own favorite language, it shouldn’t be hard to follow along.

The first thing we need to do is load up the JPG image we want to analyze. In PHP, you can use the imagecreatefromjpeg() function if you have the proper libraries installed. This returns an image handle so we can ask further questions about the graphic.

The next thing we’ll do is get the width and height of the image by using the imagesx() and imagesy() functions.

Our goal is to look up every pixel color in the image and count the frequency of each. So we’ll need some sort of array. In this case, I created an array called $rgb = array(), in which I can make a new key for each colour and increment its value as a counter.

Since we have the height and width of the image, we can make two nested for loops. We can now go column by column looking at each x and y pixel by using the imagecolorat() function. The line $rgb[$colour] = isset($rgb[$colour]) ? $rgb[$colour] + 1 : 1; accesses the $rgb array at the key equal to the pixel value and adds one to that value, starting the count at one the first time a colour is seen. When the two for loops are finished we will have looked at every single pixel, making a nice, compact array of just the known colours and their frequency.

Finally, we’ll sort this array with asort() so the most popular colors are at the end and the smallest values at the start.

The code in PHP looks like the following:

<?php
    // Fetch the JPEG image from a file
    $im = imagecreatefromjpeg("213.jpg");

    // Get the width and height of the image in pixels
    $x = imagesx($im);
    $y = imagesy($im);

    $rgb = array();
    $scaler = 10;

    // Get the colour frequency by looping through the image column by column
    for($i=0;$i<$x;$i++){
      for($j=0;$j<$y;$j++){
        // Get that pixel's RGB value and count it in the array
        $colour = imagecolorat($im, $i, $j);
        $rgb[$colour] = isset($rgb[$colour]) ? $rgb[$colour] + 1 : 1;
      }
    }

    // Sort the array so the least frequent colours come first
    asort($rgb);

    // Release the image from memory
    imagedestroy($im);
?>

At this point our $rgb array is full; we no longer need the source image and we are going to create a new visualisation in SVG based on the data.

SVG is XML-based, which means it is just text. So we can simply echo out SVG code and see the results. The first thing we should do is let the browser know that this is SVG rather than plain text. To do that we need to use the header() function with the appropriate content-type “image/svg+xml”. Now browsers or other applications can use this to render it properly. It might work without this, but it’s better to be a good net citizen and output this if you can.

After that, we print out the XML declaration and SVG DOCTYPE.

Now we can actually start to get to the SVG part that’s specific to our image. Much like any HTML page has a root of <html>, SVG has a root of <svg>.  This takes a few parameters such as the height and width of the final image. Since we don’t always know how big our source image is going to be, it’s easier to just make this 100% for each value.

The logic for outputting the design is pretty simple. We will take the least frequent colour in the image and make a circle whose radius is based on the height times the width, that is, the total number of pixels.

It doesn’t seem intuitive that the largest possible circle is the smallest possible value! What we do next will help clear this up.

This large circle is our base layer. We’ll put this down first and we’ll stack additional circles on top of this, getting slightly smaller each time. In the end, the only thing visible from our huge starting circle will be a tiny sliver around the edge.

To accomplish all this, we’ll loop through the $rgb array extracting the key, which is the colour, and the value, which is the frequency. We can do this with foreach($rgb as $k=>$v). The next few lines split the RGB value into $r, $g and $b values ready for converting to hex. In PHP you have the dechex() function, which takes a decimal number and creates a hex equivalent. We also need to pad each string with a leading zero in case the colour value is less than 16. Putting them all together we get the $hex value.
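
The split-and-pad step can be tried in isolation with a couple of known values. This is a small sketch; the wrapper name rgbIntToHex() is an assumption made for the example, while the shifts, dechex() and str_pad() are the same calls the article describes.

```php
<?php
// Hypothetical wrapper around the split-and-pad step described above.
function rgbIntToHex($colour) {
    // Each channel occupies 8 bits of the packed integer
    $r = ($colour >> 16) & 0xFF;
    $g = ($colour >> 8) & 0xFF;
    $b = $colour & 0xFF;
    // dechex() drops leading zeros, so pad every channel to two digits
    return str_pad(dechex($r), 2, '0', STR_PAD_LEFT)
         . str_pad(dechex($g), 2, '0', STR_PAD_LEFT)
         . str_pad(dechex($b), 2, '0', STR_PAD_LEFT);
}

echo rgbIntToHex(0xFF8000), "\n"; // prints ff8000
echo rgbIntToHex(10), "\n";       // prints 00000a
```

Without the padding, the second value would come out as "00a", which is not the colour intended.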

Up until this point, we haven’t even outputted any SVG graphical element. This will be our first, a circle. In SVG to make a circle, you use the <circle> element with some attributes. These attributes describe both the circle and its position. The attributes cx and cy are the circle’s centre point on the x,y grid. The attribute r is the radius and fill is a hex colour that you want the circle to be filled with. Making graphical elements couldn’t be much easier.

Now we need to take what we’ve learned and apply it to our array of colours. There are two variables we need to keep track of: the maximum size of the circle and the size of the previous circle. Also, to make sure things don’t get out of hand size-wise, I’ve added a $scaler variable to keep the size from exploding too large.

The variable $c is the maximum size that we will continually decrease the radius from. Since $c is the width times the height, every pixel in the original graphic is represented in the radius of the circle. The next variable $prev is the size of the previous circle we drew. That way, we can slowly decrease the size of the radius based on the number of pixels we’ve made circles for already.

<?php
    header('Content-Type: image/svg+xml');
    echo '<?xml version="1.0" standalone="no"?>
    <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
    "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
    <svg width="100%" height="100%" version="1.1"
    xmlns="http://www.w3.org/2000/svg">';

    $c = (int)(($x*$y)/$scaler);
    $prev = 0;
    foreach($rgb as $k=>$v){
      if($v > 0) {
        $r = ($k >> 16) & 0xFF;
        $g = ($k >> 8) & 0xFF;
        $b = $k & 0xFF;
        $hex = str_pad(dechex($r),2,'0',STR_PAD_LEFT).str_pad(dechex($g),2,'0',STR_PAD_LEFT).str_pad(dechex($b),2,'0',STR_PAD_LEFT);
        echo '<circle cx="'.$c.'" cy="'.$c.'" r="'.($c-$prev).'" fill="#'.$hex.'" />';
        echo "\n";
        $prev += (int)($v/$scaler);

      }
    }

    echo '</svg>';
?>
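
To see how $c and $prev interact, the radius bookkeeping can be pulled out into a function and run on a tiny made-up data set. This is a sketch only: ringRadii() is a hypothetical name, and the 10 by 10 pixel image with three colours is invented for the example.

```php
<?php
// Hypothetical helper: $counts holds colour frequencies, smallest first,
// exactly as asort() leaves them. Returns the radius of each circle in order.
function ringRadii(array $counts, $width, $height, $scaler) {
    $c = (int)(($width * $height) / $scaler); // radius of the base circle
    $prev = 0;                                // pixels already accounted for, scaled
    $radii = array();
    foreach ($counts as $v) {
        $radii[] = $c - $prev;
        $prev += (int)($v / $scaler);
    }
    return $radii;
}

// A 10x10 image whose three colours cover 10, 30 and 60 pixels
print_r(ringRadii(array(10, 30, 60), 10, 10, 10)); // radii 10, 9 and 6
```

The rarest colour gets the largest circle (radius 10) but only a ring one unit wide stays visible, while the most common colour is drawn last as the solid disc of radius 6 in the centre.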

If we look at some example output we will quickly see that every tiny shade of each colour gets its own ring. This means anti-aliased text creates plenty of shades of grey that we should collapse into a single representative colour. This will reduce the number of rings, but at the same time it is clearer which colours stand out the most. We could simply take the top five most popular colours, but what normally happens is you get similar shades of the same colour rather than representing the full spectrum. A better way is to try and promote colours with similar values into the most popular neighbour. To do this, we need to write a simple function which we insert before the SVG output.

    $rgb = reduceColors($rgb);

Passing in the list of $rgb colours we get their value and frequency. The idea is to loop through this full list and create a new list with the reduced colours. I have chosen a plus or minus range of 75. This means an RGB value of 100,100,100 would get promoted into the most popular colour, which is anywhere from 25,25,25 to 175,175,175. You can adjust this value to be more forgiving or more strict. Depending on your values, you will end up with more or fewer rings.
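
The plus or minus test on its own can be sketched like this, using the 100,100,100 example from the text. The function name withinRange() is an assumption for illustration; the comparison is per channel and inclusive at both ends.

```php
<?php
// Hypothetical helper: are two packed RGB colours within $plusminus
// of each other on every channel?
function withinRange($a, $b, $plusminus) {
    foreach (array(16, 8, 0) as $shift) {
        $ca = ($a >> $shift) & 0xFF;
        $cb = ($b >> $shift) & 0xFF;
        if (abs($ca - $cb) > $plusminus) {
            return false; // one channel too far apart is enough to reject
        }
    }
    return true;
}

$grey   = (100 << 16) | (100 << 8) | 100;
$dark   = (25 << 16) | (25 << 8) | 25;
$darker = (24 << 16) | (24 << 8) | 24;

var_dump(withinRange($grey, $dark, 75));   // bool(true): exactly 75 away
var_dump(withinRange($grey, $darker, 75)); // bool(false): 76 away
```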

First, we need to reverse sort the array so the most popular colours are first. This means we are promoting into the most popular rather than down to the least. As we loop through the $rgb array, we also need a sub-loop through the $temp array. If a colour falls within the plus/minus range of one already in $temp, we add its count to the first such match. Otherwise it is a new colour we haven’t seen before, and we put it into the $temp array. At the end, we re-sort the array so the smallest is first and return it so we can output the SVG.

    function reduceColors($rgb){
        $plusminus = 75;
        // Most popular colours first, so we promote into them
        arsort($rgb);
        $temp = array();

        // Merge each colour into the first kept colour that is within range
        foreach($rgb as $k=>$v){
            if($v != 0){
                $r = ($k >> 16) & 0xFF;
                $g = ($k >> 8) & 0xFF;
                $b = $k & 0xFF;

                $matched = false;
                foreach($temp as $m=>$n){
                    $rs = ($m >> 16) & 0xFF;
                    $gs = ($m >> 8) & 0xFF;
                    $bs = $m & 0xFF;

                    if (
                        ($rs <= ($r+$plusminus)) && ($rs >= ($r-$plusminus)) &&
                        ($gs <= ($g+$plusminus)) && ($gs >= ($g-$plusminus)) &&
                        ($bs <= ($b+$plusminus)) && ($bs >= ($b-$plusminus))
                    ) {
                        $temp[$m] += $v;
                        $matched = true;
                        break;
                    }
                }
                // A colour with no close neighbour starts a new entry
                if(!$matched){
                    $temp[$k] = $v;
                }
            }
        }

        // Smallest counts first again, ready for the SVG output
        asort($temp);
        return $temp;
    }

Instead of using the function, another option would be to use the built-in function imagetruecolortopalette(), which can take a maximum number of colours for the colour palette. This will fix the number of possible colours in the rings and do the colour reduction for you. The results may or may not be what you intended, but it is an easier alternative.
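
A rough sketch of that alternative follows, assuming the GD extension is loaded. paletteFrequencies() is a hypothetical helper name and dithering is switched off as an arbitrary choice. After the conversion imagecolorat() returns a palette index rather than a packed RGB value, so each index is looked up with imagecolorsforindex() and packed again to match the $rgb array used above.

```php
<?php
// Sketch, assuming the GD extension is available.
// paletteFrequencies() is a hypothetical helper name, not part of GD.
function paletteFrequencies($im, $maxColours) {
    // Let GD reduce the image to at most $maxColours palette entries
    imagetruecolortopalette($im, false, $maxColours);

    $counts = array();
    $width = imagesx($im);
    $height = imagesy($im);
    for ($i = 0; $i < $width; $i++) {
        for ($j = 0; $j < $height; $j++) {
            // Look up the real channel values behind the palette index
            $c = imagecolorsforindex($im, imagecolorat($im, $i, $j));
            $key = ($c['red'] << 16) | ($c['green'] << 8) | $c['blue'];
            $counts[$key] = isset($counts[$key]) ? $counts[$key] + 1 : 1;
        }
    }

    // Least frequent first, like the original $rgb array
    asort($counts);
    return $counts;
}

// Possible use with the cover image from earlier:
// $rgb = paletteFrequencies(imagecreatefromjpeg("213.jpg"), 5);
```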

Improvements

There are plenty of ways this could be improved. A better colour clustering algorithm, some optimised looping, maybe invert it so the most popular colours are on the outer ring instead of the inner. This was designed as just a quick starter into the world of dynamically generated visualisations. From here on, you need to take your creativity and see what you can apply it to.

SVG is hardly the scary technology that you might have thought. Given that it is simply a text-based XML format, you can write scripts in your favourite language to quickly and easily generate output. Running these scripts on different data produces different output, but the underlying code remains the same, allowing you to quickly and easily prototype new designs. Knowing SVG and how to script its output is another tool in your toolbox, useful in many aspects of work and play.

Here is the source code.

23 Jan

How to use sitecopy to mirror an old html website

By Luis Freitas

Well, tonight I had a small job: recovering an old website. I didn’t need fancy tech like rsync (which I would use if it were critical work), so tonight I’ll use my (very) old friend, sitecopy.

Here is how to use sitecopy to recover an old HTML website:

sitecopy

Name

sitecopy – maintain remote copies of web sites

Synopsis

sitecopy [options] [operation mode] sitename

Description

sitecopy is for copying locally stored web sites to remote web servers. A single command will upload files to the server which have changed locally, and delete files from the server which have been removed locally, to keep the remote site synchronized with the local site. The aim is to remove the hassle of uploading and deleting individual files using an FTP client. sitecopy will also optionally try to spot files you move locally, and move them remotely.

FTP, WebDAV and other HTTP-based authoring servers (for instance, AOLserver and Netscape Enterprise) are supported.

Getting Started

This section covers how to start maintaining a web site using sitecopy. After introducing the basics, two situations are covered: first, where you have already uploaded the site to the remote server; second, where you haven’t. Lastly, normal site maintenance activities are explained.

Introducing the Basics

If you have not already done so, you need to create an rcfile, which will store information about the sites you wish to administer. You also need to create a storage directory, which sitecopy uses to record the state of the files on each of the remote sites. The rcfile and storage directory must both be accessible only by you – sitecopy will not run otherwise. To create the storage directory with the correct permissions, use the command

mkdir -m 700 .sitecopy

from your home directory. To create the rcfile, use the commands

touch .sitecopyrc
chmod 600 .sitecopyrc

from your home directory. Once this is done, edit the rcfile to enter your site details as shown in the CONFIGURATION section.

Existing Remote Site

If you have already uploaded the site to the remote server, ensure your local files are synchronized with the remote files. Then, run

sitecopy --catchup sitename

where sitename is the name of the site you used after the site keyword in the rcfile.

If you do not have a local copy of the remote site, then you can use fetch mode to discover what is on the remote site, and synchronize mode to download it. Fetch mode works well for WebDAV servers, and might work if you’re lucky for FTP servers. Run

sitecopy --fetch sitename

to fetch the site – if this succeeds, then run

sitecopy --synch sitename

to download a local copy. Do NOT do this if you already have a local copy of your site.

New Remote Site

Ensure that the root directory of the site has been created on the server by the server administrator. Run

sitecopy --init sitename

where sitename is the name of the site you used after the site keyword in the rcfile.

Site Maintenance

After setting up the site as given in one of the two above sections, you can now start editing your local files as normal. When you have finished a set of changes, and you want to update the remote copy of the site, run:

sitecopy --update sitename

and all the changed files will be uploaded to the server. Any files you delete locally will be deleted remotely too, unless the nodelete option is specified in the rcfile. If you move any files between directories, the remote files will be deleted from the server then uploaded again unless you specify the checkmoved option in the rcfile.

At any time, if you wish to see what changes you have made to the local site since the last update, you can run

sitecopy sitename

which will display the list of differences.

Synchronization Problems

In some circumstances, the actual files which make up the remote site will be different from what sitecopy thinks is on the remote site. This can happen, for instance, if the connection to the server is broken during an update. When this situation arises, Fetch Mode should be used to fetch the list of files making up the site from the remote server.

Invocation

In normal operation, specify a single operation mode, followed by any options you choose, then one or more site names. For instance,

sitecopy --update --quiet mainsite anothersite

will quietly update the sites named ‘mainsite’ and ‘anothersite’.

Operation Modes

-l, --list
List Mode – produces a listing of all the differences between the local files and the remote copy for the specified sites.
-ll, --flatlist
Flat list Mode – like list mode, except the output produced is suitable for parsing by an external script or program. An AWK script, changes.awk, is provided which produces an HTML page from this mode.
-u, --update
Update Mode – updates the remote copy of the specified sites.
-f, --fetch
Fetch Mode – fetches the list of files from the remote server. Note that this mode has only limited support in FTP – the server must accept the MDTM command, and use a Unix-style ‘ls’ for LIST implementation.
-s, --synchronize
Synchronize Mode – updates the local site from the remote copy. WARNING: This mode overwrites local files. Use with care.
-i, --initialize
Initialization Mode – initializes the sites specified – making sitecopy think there are NO files on the remote server.
-c, --catchup
Catchup Mode – makes sitecopy think the local site is exactly the same as the remote copy.
-v, --view
View Mode – displays all the site definitions from the rcfile.
-h, --help
Display help information.
-V, --version
Display version information.

Options

-y, --prompting
Applicable in Update Mode only, will prompt the user for confirmation for each update (i.e., creating a directory, uploading a file etc.).
-r RCFILE, --rcfile=RCFILE
Specify an alternate run control file location.
-p PATH, --storepath=PATH
Specify an alternate location to use for the remote site storage directory.
-q, --quiet
Quiet output – display the filename only for each update performed.
-qq, --silent
Very quiet output – display nothing for each update performed.
-o, --show-progress
Applicable in Update Mode only, displays the progress (percentage complete) of data transfer.
-k, --keep-going
Keep going past errors in Update Mode or Synch Mode.
-a, --allsites
Perform the given operation on all sites – applicable for all modes except View Mode, for which it has no effect.
-d MASK, --debug=KEY[,KEY...]
Turns on debugging. A list of comma-separated keywords should be given. Each keyword may be one of:

socket Socket handling
files File handling
rcfile rcfile parser
http HTTP driver
httpbody Display response bodies in HTTP
ftp FTP driver
xml XML parsing information
xmlparse Low-level XML parsing information
httpauth HTTP authentication information
cleartext Display passwords in plain text

Passwords will be obscured in the debug output unless the cleartext keyword is used. An example use of debugging is to debug FTP fetch mode:

sitecopy --debug=ftp,socket --fetch sitename

Concepts

The stored state of a site is the snapshot of the state of the site saved into the storage directory (~/.sitecopy/). The storage file is used to record this state between invocations. In update mode, sitecopy builds up a files list for each site by scanning the local directory, reading in the stored state, and comparing the two – determining which files have changed, which have moved, and so on.

Configuration

Configuration is performed via the run control file (rcfile). This file contains a set of site definitions. A unique name is assigned to every site definition, which is used on the command line to refer to the site.

Each site definition contains the details of the server the site is stored on, how the site may be accessed at that server, where the site is held locally and remotely, and any other options for the site.

Site Definition

A site definition is made up of a series of lines:

site sitename
server server-name
remote remote-root-directory
local local-root-directory
[ port port-number ]
[ username username ]
[ password password ]
[ proxy-server proxy-name
  proxy-port port-number ]
[ url siteURL ]
[ protocol { ftp | webdav } ]
[ ftp nopasv ]
[ ftp showquit ]
[ ftp { usecwd | nousecwd } ]
[ http expect ]
[ http secure ]
[ safe ]
[ state { checksum | timesize } ]
[ permissions { ignore | exec | all | dir } ]
[ symlinks { ignore | follow | maintain } ]
[ nodelete ]
[ nooverwrite ]
[ checkmoved [renames] ]
[ tempupload ]
[ exclude pattern ]...
[ ignore pattern ]...
[ ascii pattern ]...

Anything after a hash (#) in a line is ignored as a comment. Values may be quoted and characters may be backslash-escaped. For example, to use the exclude pattern *#, use the following line:

exclude "*#"
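
Putting the required keys and a few optional ones together, a site definition for recovering an old site over FTP might look roughly like the fragment below. It is only a sketch: the site name, server name, user name, password and directories are made-up placeholders, and only keys listed in the site definition above are used.

```
# Hypothetical site definition; every value here is a placeholder
site oldsite
  server ftp.example.com
  protocol ftp
  username myuser
  password mypassword
  remote ~/public_html/
  local ~/html/oldsite/
  exclude *.bak
```

With a definition like this in ~/.sitecopyrc, "oldsite" is the sitename to pass on the command line, for example to the fetch and synchronize modes described earlier.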

Remote Server Options

The server key is used to specify the remote server the site is stored on. This may be either a DNS name or IP address. A connection is made to the default port for the protocol used, or that given by the port key. sitecopy supports the WebDAV or FTP protocols – the protocol key specifies which to use, taking the value of either webdav or ftp respectively. By default, FTP will be used.

The proxy-server and proxy-port keys may be used to specify a proxy server to use. Proxy servers are currently only supported for WebDAV.

If the FTP server does not support passive (PASV) mode, then the key ftp nopasv should be used. To display the message returned by the server on closing the connection, use the ftp showquit option. If the server only supports uploading files in the current working directory, use the key ftp usecwd (possible symptom: “overwrite permission denied”). Note that the remote-directory (keyword remote) must be an absolute path (starting with ‘/’), or usecwd will be ignored.

If the WebDAV server correctly supports the 100-continue expectation, e.g. Apache 1.3.9 and later, the key http expect should be used. Doing so can save some bandwidth and time in an update.

If the WebDAV server supports access via SSL, the key http secure can be used. Doing so will cause the transfers between sitecopy and the host to be performed using a secure, encrypted link. The first time SSL is used to access the server, the user will be prompted to verify the SSL certificate, if it’s not signed by a CA trusted in the system’s CA root bundle.

To authenticate the user with the server, the username and password keys are used. If it exists, the ~/.netrc will be searched for a password if one is not specified. See ftp(1) for the syntax of this file.

Basic and digest authentication are supported for WebDAV. Note that basic authentication must not be used unless the connection is known to be secure.

The full URL that is used to access the site can optionally be specified in the url key. This is used only in flat list mode, so the site URL can be inserted in ‘Recent Changes’ pages. The URL must not have a trailing slash; a valid example is

url http://www.luisfreitas.pt/mysite

If the tempupload option is given, new or changed files are uploaded with a “.in.” prefix, then moved to the true filename when the upload is complete.

File State

File state is stored in the storage files (~/.sitecopy/*), and is used to discover when a file has been changed. Two methods are supported, and can be selected using the state option, with either parameter: timesize (the default), and checksum.

timesize uses the last-modification date and the size of files to detect when they have changed. checksum uses an MD5 checksum to detect any changes to the file contents.

Note that MD5 checksumming involves reading in the entire file, and is slower than simply using the last-modification date and size. It may be useful for instance if a versioning system is in use which updates the last-modification date on a ‘checkout’, but this doesn’t actually change the file contents.

Safe Mode

Safe Mode is enabled by using the safe key. When enabled, each time a file is uploaded to the server, the modification time of the file as on the server is recorded. Subsequently, whenever this file has been changed locally and is to be uploaded again, the current modification time of the file on the server is retrieved, and compared with the stored value. If these differ, then the remote copy of the file has been altered by a foreign party. A warning message is issued, and your local copy of the file will not be uploaded over it, to prevent losing any changes.

Safe Mode can be used with FTP or WebDAV servers, but if Apache/mod_dav is used, mod_dav 0.9.11 or later is required.

Note Safe mode cannot be used in conjunction with the nooverwrite option (see below).

File Storage Locations

The remote key specifies the root directory of the remote copy of the site. It may be in the form of an absolute pathname, e.g.
remote /www/mysite/
For FTP, the directory may also be specified relative to the login directory, in which case it must be prefixed by “~/”, for example:
remote ~/public_html/

The local key specifies the directory in which the site is stored locally. This may be given relative to your home directory (as given by the environment variable $HOME), again using the “~/” prefix.
local ~/html/foosite/
local /home/fred/html/foosite/
are equivalent, if $HOME is set to “/home/fred”.

For both the local and remote keywords, a trailing slash may be used, but is not required.

File Permissions Handling

File permissions handling is dictated by the permissions key, which may be given one of three values:

ignore
to ignore file permissions completely (the default),
exec
to mirror the permissions of executable files only,
all
to mirror the permissions of all files.

This can be used, for instance, to ensure the permissions of CGI files are set. The option is currently ignored for WebDAV servers. For FTP servers, a chmod is performed remotely to set the permissions.

To handle directory permissions, the key:
permissions dir
may be used in addition to a permissions key of either exec, local or all. Note that permissions all does not imply permissions dir.

Symbolic Link Handling

Symlinks found in the local site can be either ignored, followed, or maintained. In ‘follow’ mode, the files referenced by the symlinks will be uploaded in their place. In ‘maintain’ mode, the link will be created remotely as well; see below for more information. The mode used for each site is specified with the symlinks rcfile key, which may take the value of ignore, follow or maintain to select the mode as appropriate.

The default mode is follow, i.e. symbolic links found in the local site are followed.

Symbolic link Maintain Mode

This mode is currently only supported by the WebDAV driver, and will work only with servers which implement WebDAV Advanced Collections, which is a work-in-progress. The target of the link on the server is literally copied from the target of the symlink. Hint: you can use URLs if you like:
ln -s "http://www.luisfreitas.pt" somewherehome

In this way, a “302 Redirect” can be easily set up from the client, without having to alter the server configuration.

Deleting and Moving Remote Files

The nodelete option may be used to prevent remote files from ever being deleted. This may be useful if you keep large amounts of data on the remote server which you do not need to store locally as well.

If your server does not allow you to upload changed files over existing files, then you can use the nooverwrite option. When this is used, before uploading a changed file, the remote file will be deleted.

If the checkmoved option is used, sitecopy will look for any files which have been moved locally. If any are found, when the remote site is updated, the files will be moved remotely.

If the checkmoved renames option is used, sitecopy will look for any files which have been moved or renamed locally. This option may only be used in conjunction with the state checksum option.

WARNING

If you are not using MD5 checksumming (i.e. the state checksum option) to determine file state, do NOT use the checkmoved option if you tend to hold files in different directories with identical sizes, modification times and names and ever move them about. This seems unlikely, but don’t say you haven’t been warned.
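
The reason checksums make move detection safer can be sketched in a few lines of Python: a file that disappears from one path and reappears at another with the same MD5 digest is almost certainly the same file, whereas size, modification time and name can coincide by accident. This is only an illustration with hypothetical function names, not sitecopy’s actual algorithm.

```python
import hashlib


def md5_of(data):
    """Hex MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


def find_moves(old_state, new_state):
    """Pair files that vanished from one path and appeared at another
    with an identical checksum.

    Both arguments map a site-relative path to an MD5 hex digest.
    Returns a list of (old_path, new_path) tuples.
    """
    gone = {p: d for p, d in old_state.items() if p not in new_state}
    added = {p: d for p, d in new_state.items() if p not in old_state}

    # Index the vanished files by digest so lookups are cheap.
    by_digest = {}
    for path, digest in sorted(gone.items()):
        by_digest.setdefault(digest, []).append(path)

    moves = []
    for path, digest in sorted(added.items()):
        candidates = by_digest.get(digest)
        if candidates:
            moves.append((candidates.pop(0), path))
    return moves


if __name__ == "__main__":
    old = {"/a/x.txt": md5_of(b"hello"), "/b/y.txt": md5_of(b"world")}
    new = {"/c/x.txt": md5_of(b"hello"), "/b/y.txt": md5_of(b"world")}
    print(find_moves(old, new))
```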

Excluding Files

Files may be excluded from the files list by use of the exclude key, which accepts shell-style globbing patterns. For example, use
exclude *.bak
exclude *~
exclude #*#
to exclude all files which have a .bak extension, end in a tilde (~) character, or which begin and end with a hash. Don’t forget to quote or escape the value if it includes a hash!

To exclude certain files within a particular directory, simply prefix the pattern with the directory name – including a leading slash. For instance:
exclude /docs/*.m4
exclude /files/*.gz
which will exclude all files with the .m4 extension in the ‘docs’ subdirectory of the site, and all files with the .gz extension in the files subdirectory.

An entire directory can also be excluded – simply use the directory name with no trailing slash. For example
exclude /foo/bar
exclude /where/else
to exclude the ‘foo/bar’ and ‘where/else’ subdirectories of the site.

Exclude patterns are consulted when scanning the local directory, and when scanning the remote site during a --fetch. Any file which matches any exclude pattern is not added to the files list. This means that a file which has already been uploaded by sitecopy, and subsequently matches an exclude pattern, will be deleted from the server.
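
The matching rules above can be approximated in Python with the standard fnmatch module. This is a sketch under stated assumptions, not sitecopy’s implementation: patterns without a leading slash are tested against the file name only, patterns with a leading slash are tested against the site-relative path or any of its parent directories, and “*” is assumed not to cross a “/”. The function names are hypothetical.

```python
from fnmatch import fnmatchcase
import posixpath


def _segments_match(path, pattern):
    """Match segment by segment so that '*' never crosses a '/'."""
    path_parts = path.split("/")
    pattern_parts = pattern.split("/")
    return len(path_parts) == len(pattern_parts) and all(
        fnmatchcase(part, glob) for part, glob in zip(path_parts, pattern_parts)
    )


def is_excluded(path, patterns):
    """`path` is site-relative with a leading slash, e.g. '/docs/a.m4'."""
    name = posixpath.basename(path)
    for pattern in patterns:
        if pattern.startswith("/"):
            # Anchored pattern: test the path itself, then each parent
            # directory, so excluding '/foo/bar' covers everything below it.
            candidate = path
            while candidate not in ("", "/"):
                if _segments_match(candidate, pattern):
                    return True
                candidate = posixpath.dirname(candidate)
        elif fnmatchcase(name, pattern):
            return True
    return False


if __name__ == "__main__":
    rules = ["*.bak", "*~", "#*#", "/docs/*.m4", "/foo/bar"]
    for p in ("/index.html.bak", "/docs/a.m4", "/other/a.m4", "/foo/bar/baz.txt"):
        print(p, is_excluded(p, rules))
```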

Ignoring Local Changes to Files

The ignore option is used to instruct sitecopy to ignore any local changes made to a file. If a change is made to the contents of an ignored file, this file will not be uploaded by update mode. Ignored files will be created, moved and deleted as normal.

The ignore option is used in the same way as the exclude option.

Note that synchronize mode will overwrite changes made to ignored files.

FTP Transfer Mode

To specify the FTP transfer mode for files, use the ascii key. Any files which are transferred using ASCII mode have CRLF/LF translation performed appropriately. For example, use
ascii *.pl
to upload all files with the .pl extension as ASCII text. This key has no effect with WebDAV (currently).

Return Values

Return values are specified for different operation modes. If multiple sites are specified on the command line, the return value is in respect to the last site given.

Update Mode

-1 … update never even started – configuration problem
0 … update was entirely successful.
1 … update went wrong somewhere
2 … could not connect or login to server

List Mode (default mode of operation)

-1 … could not form list – configuration problem
0 … the remote site does not need updating
1 … the remote site needs updating
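
If you call sitecopy from a script, the return values listed above can be turned into readable messages. The sketch below is a minimal Python example with hypothetical names. It also maps 255 to -1, because exit statuses on POSIX systems are 8 bits wide and a return value of -1 reaches the calling process as 255.

```python
UPDATE_STATUS = {
    -1: "update never started (configuration problem)",
    0: "update was entirely successful",
    1: "update went wrong somewhere",
    2: "could not connect or log in to server",
}

LIST_STATUS = {
    -1: "could not form list (configuration problem)",
    0: "the remote site does not need updating",
    1: "the remote site needs updating",
}


def describe(mode, code):
    """Translate a return value into the meaning documented above.

    `mode` is 'update' or 'list'; `code` is the exit status as seen
    by the calling process.
    """
    tables = {"update": UPDATE_STATUS, "list": LIST_STATUS}
    if mode not in tables:
        raise ValueError(f"unknown mode: {mode!r}")
    if code == 255:
        # -1 truncated to an unsigned 8-bit exit status.
        code = -1
    return tables[mode].get(code, f"unexpected return value {code}")


if __name__ == "__main__":
    print(describe("update", 255))
    print(describe("list", 1))
```

The code argument would typically be the returncode of a subprocess.run() call that runs sitecopy for a single site.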

Example Rcfile Contents

FTP Server, Simple Usage

Fred’s site is uploaded to the FTP server ‘my.server.com’ and held in the directory ‘public_html’, which is in the login directory. The site is stored locally in the directory /home/fred/html.

site mysite
  server my.server.com
  url http://www.server.com/fred
  username fred
  password juniper
  local /home/fred/html/
  remote ~/public_html/

FTP Server, Complex Usage

Here, Freda’s site is uploaded to the FTP server ‘ftp.elsewhere.com’, where it is held in the directory /www/freda/. The local site is stored in /home/freda/sites/elsewhere/

site anothersite
  server ftp.elsewhere.com
  username freda
  password blahblahblah
  local /home/freda/sites/elsewhere/
  remote /www/freda/
  # Freda wants files with a .bak extension or a
  # trailing ~ to be ignored:
  exclude *.bak
  exclude *~

WebDAV Server, Simple Usage

This example shows use of a WebDAV server.

site supersite
  server dav.wow.com
  protocol webdav
  username pow
  password zap
  local /home/joe/www/super/
  remote /

Files

~/.sitecopyrc Default run control file location.
~/.sitecopy/ Remote site information storage directory
~/.netrc Remote server accounts information

Bugs

Known problems: Fetch + synch modes are NOT reliable for FTP. If you need reliable operation of fetch or synch modes, you shouldn’t be using sitecopy. Try rsync instead.

Please send bug reports and feature requests to <sitecopy@lyra.org> rather than to the author, since the mailing list is archived and can be a useful resource for others.

 

22 Jan

How to create a Google Plus RSS +1

By Luis Freitas

If you want to track someone’s Google +1 clicks via an RSS feed, you’ll find no RSS button to subscribe to. You could always write your own parser to extract the information you want from cumbersome HTML, but the idea isn’t appealing, and there should be a better way to solve such a simple task.

For those familiar with jQuery, there are server-side libraries which take the same approach to HTML documents, so it’s almost trivial to find and extract the wanted content. I’m using pQuery from cpan.org:

#!/usr/bin/perl

use strict;

use CGI::Carp qw(fatalsToBrowser);
use CGI qw/Vars/;
our %Q = Vars();

my $link = "https://plus.google.com/$Q{id}/plusones#$Q{id}/plusones";
my $items = get_items($link);
my $feed = make_rss(
  $items,
  ftitle => $link,
  link => $link,
);
print "Content-type: application/xml\n\n", $feed->to_string();

sub get_items {

  use pQuery;

  my ($stranica) = @_;
  my $ret = [];

  # find all tables inside div with "g-ol" class
  pQuery($stranica)
    ->find(".g-ol")
    ->find("table")
    ->each(sub {

        my $i = shift;
        my $this = pQuery($_);

        # I want second link with title
        my $pLink = $this->find("a")->get(1);

        push @$ret, {
          link => $pLink->attr("href"),
          title => pQuery($pLink)->text,
          description => $this->get(0)->toHTML,
        };
  });

  return $ret;
}

sub make_rss {

  use XML::FeedPP;
  my ($arr, %arg) = @_;

  $arg{ftitle} ||= "Feed title";

  my $feed = XML::FeedPP::Atom->new();
  $feed->title($arg{ftitle});
  $feed->link($arg{link}) if $arg{link};

  $feed->add_item(
    title => $_->{title},
    link  => $_->{link},
    description => $_->{description},
  )
  for @$arr;

  return $feed;
}
17 Aug

High performance Innodb Mysql server configuration

By Luis Freitas

This week I had an urgent job: recovering a huge InnoDB MySQL database with 2.4 million records and 2 GB in size, plus 160 GB of associated data in 107 418 JPEG files with IPTC and EXIF info. I had to prepare and configure a new server with this hardware:

  • 4 GB RAM
  • 64 Bit system
  • 3 GHz Core 2 processor

After load testing, I arrived at a well-tuned my.cnf for this database, which receives a high volume of new entries daily:

File my.cnf

# MySQL configuration
# InnoDB database with 3 MyISAM tables linked

[client]
port		= 3306
socket		= /var/run/mysql/mysql.sock

# MySQL server
[mysqld]
port		= 3306
socket		= /var/run/mysql/mysql.sock
datadir	= /var/lib/mysql
skip-locking
key_buffer_size = 16M
max_allowed_packet = 2M
table_open_cache = 128
sort_buffer_size = 1024K
net_buffer_length = 8K
read_buffer_size = 512K
read_rnd_buffer_size = 1024K
myisam_sort_buffer_size = 8M
query_cache_size = 16M

# Added values after load testing
thread_cache_size = 4
tmp_table_size = 128M
max_heap_table_size = 128M
table_cache = 512
join_buffer_size = 512

# Replication Master Server (default)
# binary logging is required for replication
log-bin=mysql-bin

# binary logging format
binlog_format=mixed

# required unique id between 1 and 2^32 - 1
# defaults to 1 if master-host is not set
# but will not function as a master if omitted
server-id	= 1

# Innodb specifics
innodb_data_home_dir = /var/lib/mysql/
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /var/lib/mysql/
# 50 - 80 % of RAM but beware of setting memory usage too high
innodb_buffer_pool_size = 2048M
# innodb_additional_mem_pool_size generally not needed after tests
innodb_additional_mem_pool_size = 8M
# log_file_size to 25 % of buffer pool size
innodb_log_file_size = 512M
innodb_log_buffer_size = 16M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 50
# innodb_file_per_table to limit fragmentation, but optimizing will lock tables
innodb_file_per_table = 1

# The safe_mysqld script
[safe_mysqld]
log-error	= /var/log/mysql/mysqld.log
socket		= /var/run/mysql/mysql.sock

!includedir /etc/mysql

[mysqldump]
socket		= /var/run/mysql/mysql.sock
quick
max_allowed_packet = 32M

[mysql]
no-auto-rehash
# Remove the next comment character if you are not familiar with SQL
#safe-updates

[myisamchk]
key_buffer_size = 60M
sort_buffer_size = 60M
read_buffer = 8M
write_buffer = 8M

[mysqlhotcopy]
interactive-timeout

[mysqld_multi]
mysqld     = /usr/bin/mysqld_safe
mysqladmin = /usr/bin/mysqladmin
log        = /var/log/mysqld_multi.log

# Please give new suggestions if you think you can optimize performance even more. :) 
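
The two sizing rules in the comments above (buffer pool at 50 to 80 % of RAM, log file at 25 % of the buffer pool) are easy to check with a little arithmetic. Here is a minimal Python sketch, with a hypothetical function name, that reproduces the values used in this my.cnf for the 4 GB server. It only encodes those two rules of thumb and is no substitute for load testing.

```python
def innodb_sizes(ram_mb, pool_fraction=0.5, log_fraction=0.25):
    """Derive InnoDB sizes from total RAM using the rules of thumb above.

    ram_mb        total RAM in megabytes
    pool_fraction share of RAM for the buffer pool (0.5 to 0.8)
    log_fraction  share of the buffer pool for the log file size
    """
    if not 0.5 <= pool_fraction <= 0.8:
        raise ValueError("pool_fraction should stay between 0.5 and 0.8")
    pool_mb = int(ram_mb * pool_fraction)
    log_mb = int(pool_mb * log_fraction)
    return {
        "innodb_buffer_pool_size": f"{pool_mb}M",
        "innodb_log_file_size": f"{log_mb}M",
    }


if __name__ == "__main__":
    # 4 GB of RAM is 4096 MB.
    for key, value in innodb_sizes(4096).items():
        print(f"{key} = {value}")
```

With 4096 MB of RAM and the lower bound of 50 %, this gives the 2048M buffer pool and 512M log file size used in the file.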
22 Jun

How to install KDE4 applications on Windows

By Luis Freitas

KDE (K Desktop Environment) is one of the most common desktop environments used under Linux and Unix systems. Apparently it has reached version 4. KDE4 is based on Qt4 which is also released under the GPL for Windows and Mac OS X. Therefore KDE 4 applications can be compiled and run natively on these operating systems as well. The centerpiece is a redesigned desktop and panels collectively called Plasma which replaces Kicker, KDesktop, and SuperKaramba by integrating their functionality into one piece of technology.

In this tutorial I will show you how to install and run KDE4 applications natively on Windows. Windows 2000, XP, and Vista are supported. Of course, since this project is still in beta, some applications may not work correctly or may not work at all. However, it is a very promising project that will allow Windows users to run Linux applications. KOffice, Kopete, Amarok, KTorrent, Konqueror, KDevelop, K3b, KMail and Dolphin are only some of them. Trust me, the list of Qt applications is very big!

To get started, click on the following link to download the KDE Installer for Windows. There, click on kdewin-installer-gui-0.9.3-1.exe (1) (or the latest one) and select Save.

When the download completes, click the Run button. You may get a Security Warning. Just click the Run button again.

KDE installer should start. The installation process is quite simple as you will see. If this is the first time you run this wizard click Next (1). However you can use this wizard at any time to add, remove or upgrade KDE packages. In that case check the box next to ‘go directly to the download server page, skip basic settings’ (2).

Now you must choose a directory in which all application packages will be stored. Here I have chosen C:\KDE4. Click Next.

In the next screen you must select the mode the installer should work in. Since you just want to run KDE applications, leave the default ‘End User’ option selected. Click Next.

Next you must select the directory where downloaded packages are stored. Here I have again chosen C:\KDE4.

Configuring Internet Settings is next. The default option ‘I have direct connection to the Internet’ should be ok. If it isn’t I guess you already know what the other options are and which one to use.

After that you must select the server from which you want to download all packages. You’d better choose the location which is closest to you.

Currently 4.1.0 stable release is available and it’s the one we want. So just click Next.

Next, select the packages you want to be installed. To start, you can select all of them and play a little. There are many games included in the kdegames package. ;) If you prefer a language other than English, don’t forget to select the appropriate language package as well. Besides, at any time you can install or uninstall any package simply by running this installer again.

The next screen shows you additional packages that must be installed as dependencies. Just click Next.

Now sit back and wait till the installation process completes!

In case you have chosen oggvorbis for installation a window will pop-up. Here click the I Agree button.

At some point Visual C++ 2005 Redistributable will be automatically installed. Depending on your Internet Connection the installation will complete sooner or later.

Now you must configure Windows environment variables in order to run KDE applications. Right click on the My Computer icon on your Desktop and select Properties. Go to the Advanced tab and click on the Environment Variables button. Click New (1) and set KDEDIRS as Variable name (2) and C:\KDE4 as Variable value (3). Click Ok.

Under System Variables find the Path variable (1) and click the Edit button (2).

A new window will pop up. Here at the end of the Variable value add

%KDEDIRS%\lib;%KDEDIRS%\bin;
And click Ok.

Now under Start menu -> KDE 4.1.00 release you can find all KDE4 applications and utilities. In the screenshot below you can see KolourPaint, Amarok 2, Dolphin, Kopete and Kcalc all running natively under Windows XP!

Set oxygen style for widgets

The default KDE widget style on Windows is the native one. However you can use KDE4 Oxygen style theme. Go to C:\Documents and Settings\your_user\.kde\share\config and select to open kdeglobals with Wordpad.

In that file paste the following lines:

[General]
widgetStyle=oxygen
If [General] section already exists just paste widgetStyle=oxygen below it. Save and exit. Newly started applications should be displayed with Oxygen style now.

Change the mouse to Double Click

To change the mouse to use double click:

Add a new section with a line:

[KDE]
SingleClick=false
Newly started applications (Dolphin and Konqueror) should use double click now.

Change locale and country settings

To change locale setting:

Add a new section with the line:

[Locale]
Country=**
Language=**
Replace ** with your lowercase alpha-2 country code. Of course, during the installation process you should have installed your language localization package.

Change native KDE file dialogs

To choose native or KDE file dialog:

Add a new section with the lines:

[KFileDialog Settings]
Native=false
Set Native to either true or false.
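
If you would rather not edit kdeglobals by hand, the changes from the last four sections can be scripted. The sketch below uses Python’s standard configparser on the file’s text. It is only a sketch: the function name is hypothetical, and it assumes the file is plain INI as in the snippets above. A real kdeglobals may contain constructs that configparser does not handle, so keep a backup before writing anything back.

```python
import configparser
import io


def apply_settings(text, settings):
    """Return INI-style `text` with `settings` merged in.

    `settings` maps a section name to a dict of key/value strings.
    Existing sections are extended, missing ones are appended.
    """
    parser = configparser.RawConfigParser()
    parser.optionxform = str  # keep key case, e.g. 'widgetStyle'
    parser.read_string(text)
    for section, values in settings.items():
        if not parser.has_section(section):
            parser.add_section(section)
        for key, value in values.items():
            parser.set(section, key, value)
    out = io.StringIO()
    # Write 'key=value' without spaces, as in the snippets above.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    current = "[General]\nfoo=bar\n"  # stand-in for the file's contents
    wanted = {
        "General": {"widgetStyle": "oxygen"},
        "KDE": {"SingleClick": "false"},
        "KFileDialog Settings": {"Native": "false"},
    }
    print(apply_settings(current, wanted))
```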

Conclusion

KDE on Windows seems a very good and very promising effort at natively porting KDE applications to MS Windows. Maybe this will become a reason for more Windows users to try the real thing, which is a Linux distribution of course.
