Unredir
This script scans the pages of your website, tests each URL, and when redirected, replaces the URL with the new address.
This is useful for sites that switch from HTTP to HTTPS: it updates the links, both those pointing to the site itself and those pointing to other sites.
It also reports broken links, so for static sites it can replace a link-testing tool such as Link Checker on this site.
The code
The program uses PHP's DOMDocument class to find links in <a> tags and images. It also loads the file as plain text with the file_get_contents function.
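The link-extraction step described above can be sketched as follows. This is only an illustration: the function name extract_links and the sample markup are assumptions and are not taken from unredir itself.

```php
<?php
// Hypothetical helper: collect the URLs found in <a href> and <img src>.
function extract_links(string $html): array
{
    $doc = new DOMDocument();
    // Keep libxml quiet about imperfect, real-world HTML.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        if ($a->hasAttribute('href')) {
            $urls[] = $a->getAttribute('href');
        }
    }
    foreach ($doc->getElementsByTagName('img') as $img) {
        if ($img->hasAttribute('src')) {
            $urls[] = $img->getAttribute('src');
        }
    }
    // Remove duplicates so that each URL is tested only once.
    return array_values(array_unique($urls));
}

// Sample markup (illustrative only).
$page = '<p><a href="http://example.com/">Home</a> '
      . '<img src="http://example.com/logo.png" alt=""></p>';
print_r(extract_links($page));
```

DOMDocument is used here only to read the links; the page itself is not modified through the DOM.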
A routine calls cURL to test whether a link is redirected and, if so, to find the final address of the redirect.
Redirected URLs are replaced with the str_replace function (not setAttribute), and the contents are then saved with file_put_contents.
Using these functions avoids the saveHTMLFile method, which tries to reconstruct the HTML content before saving the file. That method adds tags that may already exist in an included PHP file.
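A minimal sketch of this replace-as-text approach is shown below. The function name replace_url_in_file, its $dryRun parameter and the choice to match the URL only as a quoted attribute value are assumptions made for the example; the actual script may proceed differently.

```php
<?php
// Hypothetical helper: rewrite one URL in a file without re-serializing the HTML.
// Returns the number of replacements found.
function replace_url_in_file(string $path, string $old, string $new, bool $dryRun = false): int
{
    $text = file_get_contents($path);
    if ($text === false) {
        return 0; // unreadable file: nothing replaced
    }
    // Assumption: match the URL only as a quoted attribute value, so that
    // a longer URL sharing the same prefix is left untouched.
    $count = 0;
    $updated = str_replace(
        ['"' . $old . '"', "'" . $old . "'"],
        ['"' . $new . '"', "'" . $new . "'"],
        $text,
        $count
    );
    // In dry-run mode the file is left as it is.
    if ($count > 0 && !$dryRun) {
        file_put_contents($path, $updated);
    }
    return $count;
}

// Usage on a temporary file (illustrative content).
$file = tempnam(sys_get_temp_dir(), 'unredir');
file_put_contents($file, '<a href="http://example.com/">Home</a>');
replace_url_in_file($file, 'http://example.com/', 'https://example.com/');
echo file_get_contents($file), "\n";
unlink($file);
```

Because the file is handled as plain text, everything outside the replaced URL, including any PHP code in the page, is written back unchanged.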
PHP code to check for redirection:
// Returns the final URL if $url answers with a 301 and the redirect chain
// ends with a 200; otherwise returns an empty string.
function redirected($url)
{
    $hcurl = curl_init();
    curl_setopt($hcurl, CURLOPT_CONNECTTIMEOUT, 300);
    curl_setopt($hcurl, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($hcurl, CURLOPT_VERBOSE, false);
    curl_setopt($hcurl, CURLOPT_URL, $url);
    curl_setopt($hcurl, CURLOPT_HEADER, true);
    curl_setopt($hcurl, CURLOPT_NOBODY, true);           // request headers only
    curl_setopt($hcurl, CURLOPT_FOLLOWLOCATION, false);  // first pass: do not follow
    curl_setopt($hcurl, CURLOPT_SSL_VERIFYPEER, false);  // certificate is not verified

    // First request: is the URL permanently redirected (301)?
    $headers = curl_exec($hcurl);
    $code = curl_getinfo($hcurl, CURLINFO_HTTP_CODE);
    if ($code != 301)
    {
        curl_close($hcurl);
        return "";
    }

    // Second request: follow the redirects to get the final address.
    curl_setopt($hcurl, CURLOPT_FOLLOWLOCATION, true);
    $headers = curl_exec($hcurl);
    $newurl = curl_getinfo($hcurl, CURLINFO_EFFECTIVE_URL);
    $code = curl_getinfo($hcurl, CURLINFO_HTTP_CODE);
    curl_close($hcurl);

    // Keep the new address only if it actually loads.
    if ($code != 200)
    {
        return "";
    }
    return $newurl;
}
Manual
Open a command-line console and go to the directory containing the pages of the site you want to update. Type:
php c:/unredir/unredir.php [options]
Replace the directory above with the one where you installed unredir.
Two options are available:
-t: Test the result without changing the files.
-v: Verbose, display all scanned pages.
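To illustrate how the two options above could be read from the command line, here is a small sketch. The function name parse_options and the shape of the returned array are hypothetical; the real unredir.php may handle its arguments differently.

```php
<?php
// Hypothetical option parser for the -t and -v flags.
function parse_options(array $argv): array
{
    $options = ['test' => false, 'verbose' => false, 'unknown' => []];
    // $argv[0] is the script name, so the options start at index 1.
    foreach (array_slice($argv, 1) as $arg) {
        switch ($arg) {
            case '-t':
                $options['test'] = true;     // do not write any file
                break;
            case '-v':
                $options['verbose'] = true;  // list every scanned page
                break;
            default:
                $options['unknown'][] = $arg;
        }
    }
    return $options;
}

// Simulated command line: php unredir.php -t -v
print_r(parse_options(['unredir.php', '-t', '-v']));
```

Collecting unrecognized arguments separately lets the caller decide whether to warn about them or ignore them.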
Download
Versions
- March 24, 2021. Added count of broken links found.
See also ...
From HTTP to HTTPS. This script replaces links from http to https for a given domain. It complements Unredir in that it also changes the links in the text, but it only takes redirections into account for a specified domain.