PHP Domain Parser is a Public Suffix List based domain parser implemented in PHP.
While there are plenty of excellent URL parsers and builders available, very few projects can accurately parse a domain into its component subdomain, registrable domain, and public suffix parts.
Consider the domain www.pref.okinawa.jp. In this domain, the public suffix portion is okinawa.jp, the registrable domain is pref.okinawa.jp, and the subdomain is www. You can't regex that.
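To make the point concrete, here is a minimal sketch, not part of the library, of a naive splitter that assumes the public suffix is always the last label. The function name naiveSplit is hypothetical and exists only for this illustration.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical naive splitter: assumes the public suffix is always the last
 * label. This is NOT how the library works; it only shows why a fixed rule
 * cannot replace a lookup in the Public Suffix List.
 *
 * @return array{subDomain: string, registrableDomain: string, publicSuffix: string}
 */
function naiveSplit(string $domain): array
{
    $labels = explode('.', strtolower($domain));
    $publicSuffix = array_pop($labels);      // last label only
    $secondLevel = array_pop($labels) ?? ''; // label just before it

    return [
        'subDomain' => implode('.', $labels),
        'registrableDomain' => $secondLevel.'.'.$publicSuffix,
        'publicSuffix' => $publicSuffix,
    ];
}

$parts = naiveSplit('www.PreF.OkiNawA.jP');
echo $parts['publicSuffix'], PHP_EOL;      // prints "jp", the list says "okinawa.jp"
echo $parts['registrableDomain'], PHP_EOL; // prints "okinawa.jp", the list says "pref.okinawa.jp"
```

Only the list itself knows that okinawa.jp is a suffix under which names are registered, so any rule based on label position alone gets this domain wrong.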
PHP Domain Parser provides:
- accurate Public Suffix List based parsing.
- accurate Root Zone Database parsing.
$ composer require jeremykendall/php-domain-parser
You need:
- PHP >= 7.4 but the latest stable version of PHP is recommended
- the intl extension
The first objective of the library is to use the Public Suffix List to easily resolve a domain as a Pdp\ResolvedDomain object using the following methods:
<?php
use Pdp\Rules;
$rules = Rules::fromPath('/path/to/cache/public-suffix-list.dat');
$resolvedDomain = $rules->resolve('www.PreF.OkiNawA.jP');
echo $resolvedDomain->getDomain(); //display 'www.pref.okinawa.jp';
echo $resolvedDomain->getPublicSuffix(); //display 'okinawa.jp';
echo $resolvedDomain->getSecondLevelDomain(); //display 'pref';
echo $resolvedDomain->getRegistrableDomain(); //display 'pref.okinawa.jp';
echo $resolvedDomain->getSubDomain(); //display 'www';
In case of an error, an exception which extends Pdp\ExceptionInterface is thrown.
While the Public Suffix List is a community-maintained list, the package also provides access to the Top Level Domain information published by IANA, so that a domain can always be resolved against every registered TLD, including the newest ones.
use Pdp\TopLevelDomains;
$iana = TopLevelDomains::fromPath('/path/to/cache/tlds-alpha-by-domain.txt');
$resolvedDomain = $iana->resolve('www.PreF.OkiNawA.jP');
echo $resolvedDomain->getDomain(); //display 'www.pref.okinawa.jp';
echo $resolvedDomain->getPublicSuffix(); //display 'jp';
echo $resolvedDomain->getSecondLevelDomain(); //display 'okinawa';
echo $resolvedDomain->getRegistrableDomain(); //display 'okinawa.jp';
echo $resolvedDomain->getSubDomain(); //display 'www.pref';
In case of an error, an exception which extends Pdp\ExceptionInterface is thrown.
WARNING:
You should never use the library this way in production without, at least, a caching mechanism to reduce PSL downloads.
Using the Public Suffix List to determine what is a valid domain name and what isn't is dangerous, particularly in these days when new gTLDs are arriving at a rapid pace. The DNS is the proper source for this information.
If you are looking to know the validity of a Top Level Domain, the IANA Root Zone Database is the proper source for this information.
If you must use this library for any of the above purposes, please consider integrating an update mechanism into your software.
Depending on your software, the mechanism used to store your database may differ. Nevertheless, the library comes bundled with an optional service that resolves domain names without the constant network overhead of continuously downloading the remote databases.
The Pdp\Storage\PsrStorageFactory enables returning storage instances that retrieve, convert and cache the Public Suffix List as well as the IANA Root Zone Database using standard interfaces published by the PHP-FIG to improve its interoperability with any modern PHP codebase.
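To work as intended, the Pdp\Storage\PsrStorageFactory constructor requires:
- a PSR-16 cache implementation;
- a PSR-18 HTTP client implementation;
- a Psr\Http\Message\RequestFactoryInterface implementation to create the HTTP requests.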
When creating a new storage instance you will require:
- a $cachePrefix argument to optionally add a prefix to your cache index, default to the empty string '';
- a $ttl argument if you need to set the default $ttl, default to null to use the underlying caching default TTL.
The $ttl argument can be:
- an int representing time in seconds (see PSR-16);
- a DateInterval object (see PSR-16);
- a DateTimeInterface object representing the date and time when the item should expire.
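The three accepted forms describe the same thing, an expiry point in time, in different ways. The sketch below shows how each form could be mapped to an expiry timestamp. It is an illustration only: the helper ttlToExpiryTimestamp is hypothetical and is not the library's implementation.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not the library's code: maps any accepted $ttl form
 * to an absolute expiry timestamp, or null when the cache default applies.
 *
 * @param int|DateInterval|DateTimeInterface|null $ttl
 */
function ttlToExpiryTimestamp($ttl, DateTimeImmutable $now): ?int
{
    if (null === $ttl) {
        return null; // defer to the underlying cache's default TTL
    }

    if (is_int($ttl)) {
        return $now->getTimestamp() + $ttl; // seconds from now
    }

    if ($ttl instanceof DateInterval) {
        return $now->add($ttl)->getTimestamp(); // interval from now
    }

    if ($ttl instanceof DateTimeInterface) {
        return $ttl->getTimestamp(); // already an absolute date
    }

    throw new InvalidArgumentException('Unsupported TTL type.');
}

$now = new DateTimeImmutable('2020-01-01 00:00:00', new DateTimeZone('UTC'));

// One hour expressed as seconds and as an interval gives the same expiry.
var_dump(ttlToExpiryTimestamp(3600, $now) === ttlToExpiryTimestamp(new DateInterval('PT1H'), $now)); // bool(true)
```

A DateInterval such as P1D is usually the most readable choice for lists that change rarely, while an absolute DateTimeInterface suits a refresh tied to a fixed schedule.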
However, the package no longer provides any implementation of such interfaces, as there are many robust implementations that can easily be found on packagist.org.
THIS IS THE RECOMMENDED WAY OF USING THE LIBRARY
For the purpose of this example we used:
- Guzzle as a PSR-18 HTTP client implementation
- The Symfony Cache Component as a PSR-16 cache implementation
You could easily use other packages as long as they implement the required PSR interfaces.
<?php
use GuzzleHttp\Client;
use GuzzleHttp\Psr7\Request;
use Pdp\Storage\PsrStorageFactory;
use Psr\Http\Message\RequestFactoryInterface;
use Psr\Http\Message\RequestInterface;
use Symfony\Component\Cache\Adapter\FilesystemAdapter;
use Symfony\Component\Cache\Psr16Cache;
$cache = new Psr16Cache(new FilesystemAdapter('pdp', 3600, __DIR__.'/data'));
$client = new Client();
$requestFactory = new class implements RequestFactoryInterface {
    public function createRequest(string $method, $uri): RequestInterface
    {
        return new Request($method, $uri);
    }
};
$cachePrefix = 'pdp_';
$cacheTtl = new DateInterval('P1D');
$factory = new PsrStorageFactory($cache, $client, $requestFactory);
$pslStorage = $factory->createPublicSuffixListStorage($cachePrefix, $cacheTtl);
$rzdStorage = $factory->createRootZoneDatabaseStorage($cachePrefix, $cacheTtl);
$rules = $pslStorage->get(PsrStorageFactory::URL_PSL);
$tldDomains = $rzdStorage->get(PsrStorageFactory::URL_RZD);
It is important to always have an up-to-date Public Suffix List and Root Zone Database. This library no longer provides an out-of-the-box script to do so, as implementing such a job heavily depends on your application setup.
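As a starting point, a refresh job for a file-based setup could look like the sketch below. It is a minimal illustration under stated assumptions, not something shipped with the library: the refreshList function and the $download closure are hypothetical names, and the download strategy is passed in as a callable so you can swap in your own HTTP client.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of the library: refreshes a local copy of a
 * remote list. $fetch is any callable that returns the body as a string, or
 * null on failure.
 */
function refreshList(string $url, string $destination, callable $fetch): bool
{
    $body = $fetch($url);
    if (!is_string($body) || '' === trim($body)) {
        return false; // keep the previous copy when the download fails
    }

    // Write to a temporary file first, then rename it, so that readers
    // never see a half-written list.
    $tmp = $destination.'.tmp';
    if (false === file_put_contents($tmp, $body)) {
        return false;
    }

    return rename($tmp, $destination);
}

// Assumed download strategy: a plain file_get_contents() call.
$download = static function (string $url): ?string {
    $body = @file_get_contents($url);

    return false === $body ? null : $body;
};
```

A scheduled task could then call refreshList('https://publicsuffix.org/list/public_suffix_list.dat', '/path/to/cache/public-suffix-list.dat', $download) and do the same with 'https://data.iana.org/TLD/tlds-alpha-by-domain.txt' for the Root Zone Database. Because a failed download leaves the existing file untouched, the application keeps resolving domains against the last good copy.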
Please see CHANGELOG for more information about what has been changed since version 5.0.0 was released.
Contributions are welcome and will be fully credited. Please see CONTRIBUTING for details.
pdp-domain-parser has:
- a PHPUnit test suite
- a coding style compliance test suite using PHP CS Fixer.
- a code analysis compliance test suite using PHPStan.
To run the tests, run the following command from the project folder.
$ composer test
If you discover any security related issues, please email [email protected] instead of using the issue tracker.
The MIT License (MIT). Please see License File for more information.
Portions of the Pdp\Converter and Pdp\Rules are derivative works of the PHP registered-domain-libs.
Those parts of this codebase are heavily commented, and I've included a copy of
the Apache Software Foundation License 2.0 in this project.