Just listen to Alex

July 12, 2008

Batch downloading cover art with PHP and Google Image Search

Filed under: programming — Tags: — bosmeeuw @ 2:58 pm

Have you come by a large collection of MP3s, sorted into one folder per album? Are the folders named something like “Artist Name – Album Name”? Would you like to download the cover for each album so you can browse them like this in Windows Explorer?

Screenshot

Yes? Then put the wad of PHP code below into a file (named get_covers.php or whatever you like) in the root folder of your MP3s, and run it with PHP from that folder. For every folder that doesn’t already have a folder.jpg, the script does a Google image search on the folder name (underscores are treated as spaces) and saves the first result that downloads successfully as folder.jpg, which Windows Explorer then uses as the folder thumbnail. For 99% of my ~500 albums, this worked perfectly.

Warning: this script downloads arbitrary image results from Google to your hard drive, directly from the websites hosting them, without any attempt at filtering out harmful results. The results may include corrupt or virus-infected images and pictures of naked ladies, which can seriously harm your computer.
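If that worries you, one cheap sanity check is to look at the first few bytes of whatever came back before saving it. The sketch below is not part of the script that follows; `looksLikeJpeg` is a made-up helper name, and it only assumes that a real JPEG starts with the bytes FF D8 FF. It will reject HTML error pages, PNGs and GIFs posing as covers, but it does nothing against a deliberately malicious JPEG.

```php
<?php
// Hypothetical helper, not part of the original script: returns true when
// the downloaded bytes start with the JPEG signature (FF D8 FF).
function looksLikeJpeg($data) {
	return is_string($data) && strncmp($data, "\xFF\xD8\xFF", 3) === 0;
}

// Usage sketch inside the download loop (assumes $imageContents and $folder
// as they are named in the script):
// if (looksLikeJpeg($imageContents)) {
//     file_put_contents("{$folder}/folder.jpg", $imageContents);
// }
```

Because the result is always saved as folder.jpg, rejecting anything that is not a JPEG seems like a reasonable rule here, though it will also skip covers that happen to be PNG or GIF files.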

<?php

$folders = glob('*');

$ctx = stream_context_create(array(
    'http' => array(
        'timeout' => 10
        )
    )
); 

foreach($folders as $folder) {
	if(!is_dir($folder)) {
		echo "Skipping {$folder}\n";
		continue;
	}

	if(is_file($folder."/folder.jpg")) {
		echo "Already have cover for {$folder}\n";
		continue;
	}

	$album = str_replace('_',' ',$folder);

	echo "Checking {$album}.. ";

	$googleUrl = "http://images.google.com/images?um=1&hl=en&safe=off&imgsz=medium&q=".urlencode($album);

	// Second argument is the use_include_path flag, which is a boolean
	$contents = file_get_contents($googleUrl,false,$ctx);
	
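	// Each result's source URL is expected in the page as imgurl\x3d<url>\x26
	$found = false;
	if($contents !== false && preg_match_all('/imgurl\\\\x3d(.*?)\\\\x26/i',$contents,$matches)) {
		// Try the results in order and keep the first one that downloads
		foreach($matches[1] as $image) {
			$imageContents = file_get_contents($image,false,$ctx);
			if($imageContents !== false && $imageContents !== '') {
				file_put_contents("{$folder}/folder.jpg",$imageContents);
				echo "Found image {$image}\n";
				$found = true;
				break;
			}
		}
	}
	// Also report albums where results were listed but none could be downloaded
	if(!$found) {
		echo "nothing found!\n";
	}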
}
?>
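
The extraction step is the part most likely to break, since it depends entirely on how Google happens to format its result page. To try the regex without hitting the network, here is a minimal sketch that wraps it in a function and runs it on a made-up sample string. `extractImageUrls` is a hypothetical name and the sample markup is invented for illustration; the page Google actually serves may look quite different.

```php
<?php
// Hypothetical helper wrapping the script's regex so it can be tried offline.
// Returns the list of URLs found between "imgurl\x3d" and "\x26".
function extractImageUrls($html) {
	if (!is_string($html)) {
		return array();
	}
	if (!preg_match_all('/imgurl\\\\x3d(.*?)\\\\x26/i', $html, $matches)) {
		return array();
	}
	return $matches[1];
}

// Invented sample. In a single-quoted PHP string, \x3d stays a literal
// backslash followed by "x3d", which is what the pattern looks for.
$sample = 'a imgurl\x3dhttp://example.com/one.jpg\x26b imgurl\x3dhttp://example.com/two.jpg\x26c';

print_r(extractImageUrls($sample));
```

If the script reports "nothing found!" for every album, feeding a saved copy of the result page through a function like this is a quick way to tell whether the pattern still matches anything.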