Stop Using parse_url(): PHP 8.5's URI Extension Explained
parse_url() has been silently returning incomplete arrays and swallowing garbage input since PHP 4. PHP 8.5's new URI extension fixes that — immutable objects, typed getters, and actual exceptions when something goes wrong.
Meet the PHP 8.5 URI Extension
The extension ships two classes, bundled and always available — no composer require, no PECL:
- Uri\Rfc3986\Uri: strict URI parsing following RFC 3986. Use this for backend services, custom schemes (ftp://, mailto:, your own), and anywhere that needs hard validation.
- Uri\WhatWg\Url: browser-compatible URL parsing following the WHATWG URL standard. Use this when you need behaviour that matches Chrome or Firefox: Unicode hostnames, lenient normalisation, percent-encoding handled for you.
Both classes are immutable. Every modification returns a new instance; the original is untouched. It is the same design PHP developers already know from PSR-7 HTTP messages, and the same principle behind building immutable value objects with PHP readonly classes.
Parsing and Reading URI Components
The old way with parse_url():
$parts = parse_url('https://richdynamix.com/articles?tag=php#top');
echo $parts['host']; // richdynamix.com
echo $parts['fragment']; // top
echo $parts['port']; // Warning: Undefined array key "port" (a Notice: Undefined index before PHP 8.0)
The array returned by parse_url() only contains keys for components that are present in the URL. Miss an isset() check and you get a warning at runtime: no exception, no early failure.
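To make the cost concrete, here is a minimal sketch of the defensive code parse_url() pushes onto every caller. The helper name portOrDefault() and the default-port table are illustrative assumptions, not part of PHP.

```php
<?php
// Hypothetical helper: read the port from parse_url() without triggering
// an "undefined array key" warning when the URL has no explicit port.
function portOrDefault(string $url): ?int
{
    $parts = parse_url($url);
    if ($parts === false) {
        return null; // seriously malformed input
    }

    // Assumed default-port table, for illustration only.
    $defaults = ['http' => 80, 'https' => 443];

    // Every lookup needs a ?? fallback because any key may be missing.
    return $parts['port'] ?? $defaults[$parts['scheme'] ?? ''] ?? null;
}

var_dump(portOrDefault('https://richdynamix.com/articles')); // int(443)
```

Every component read needs the same kind of guard, which is the boilerplate the typed getters remove.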
With the PHP 8.5 URI extension:
use Uri\Rfc3986\Uri;
$uri = new Uri('https://richdynamix.com/articles?tag=php#top');
echo $uri->getScheme(); // https
echo $uri->getHost(); // richdynamix.com
echo $uri->getPath(); // /articles
echo $uri->getQuery(); // tag=php
echo $uri->getFragment(); // top
var_dump($uri->getPort()); // NULL — absent components return null, not an undefined key
No undefined-key warnings. Every getter is typed, and an optional component that is absent comes back as null (the path is always a string, empty when there is none).
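Because absence is expressed as null, ordinary null handling replaces isset() checks. The following is a small sketch under that assumption; describeTarget() is a hypothetical helper, not part of the extension.

```php
<?php
use Uri\Rfc3986\Uri;

// Hypothetical helper: describe where a URI points, tolerating absent parts.
function describeTarget(Uri $uri): string
{
    $host = $uri->getHost() ?? '(no host)';
    $port = $uri->getPort(); // ?int: null when the URI has no explicit port

    return $port === null ? $host : $host . ':' . $port;
}

echo describeTarget(new Uri('https://richdynamix.com/articles')), "\n";      // richdynamix.com
echo describeTarget(new Uri('https://richdynamix.com:8443/articles')), "\n"; // richdynamix.com:8443
```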
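The RFC 3986 class also exposes raw variants of its getters. Both forms stay percent-encoded: the plain getter returns the normalised component (percent-encoded unreserved characters such as %7E are decoded, everything else is left encoded), while the raw getter returns it exactly as given:
$uri = new Uri('https://example.com/%7Euser/caf%C3%A9');
echo $uri->getPath(); // /~user/caf%C3%A9 (normalised; é stays encoded)
echo $uri->getRawPath(); // /%7Euser/caf%C3%A9 (original encoding preserved)
Use the raw getters when you need to forward the URI elsewhere without altering its encoding.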
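For display, as opposed to forwarding, percent-encoded text can be decoded with PHP's standard rawurldecode(). This is a minimal sketch that works on the string a raw getter returns; decodedSegments() is a hypothetical helper name, not part of the extension.

```php
<?php
// Hypothetical helper: split an encoded path into decoded segments.
// Splitting before decoding keeps an encoded slash (%2F) inside its segment.
function decodedSegments(string $rawPath): array
{
    $segments = explode('/', trim($rawPath, '/'));

    return array_map('rawurldecode', $segments);
}

print_r(decodedSegments('/caf%C3%A9/menu')); // two segments: "café" and "menu"
```

Decoding the whole path in one call would turn %2F into a real slash and change the number of segments, which is why the sketch splits first.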
Modifying URIs Immutably
Each withX() method returns a new Uri instance with that component replaced. Chain them freely:
use Uri\Rfc3986\Uri;
$original = new Uri('https://richdynamix.com/articles?tag=php');
$updated = $original
    ->withQuery('tag=laravel&sort=recent')
    ->withFragment('results');
echo $original->toString(); // https://richdynamix.com/articles?tag=php
echo $updated->toString(); // https://richdynamix.com/articles?tag=laravel&sort=recent#results
toString() returns a normalised representation. If you need the URI exactly as you gave it, call toRawString() instead.
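Both points, immutability and the two string forms, can be observed in a few lines. This is a sketch only; the input URI is made up for illustration, and the exact normalised output is left for you to inspect.

```php
<?php
use Uri\Rfc3986\Uri;

$input = 'HTTPS://RichDynamix.com/articles/../articles?tag=php';
$uri   = new Uri($input);

// A with*() call returns a new object; $uri itself never changes.
$paged = $uri->withQuery('tag=php&page=2');

var_dump($uri === $paged);                // bool(false): two distinct instances
var_dump($uri->getQuery());               // string(7) "tag=php": original untouched
var_dump($uri->toRawString() === $input); // raw form is compared with the input as given
echo $uri->toString(), "\n";              // normalised form of the same URI
```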
Available modifiers on Uri\Rfc3986\Uri:
$uri->withScheme(?string $scheme): static
$uri->withHost(?string $host): static
$uri->withPort(?int $port): static
$uri->withPath(string $path): static
$uri->withQuery(?string $query): static
$uri->withFragment(?string $fragment): static
$uri->withUserInfo(?string $userinfo): static // one "user:password" string, not two arguments
This is the same immutable chaining pattern PHP 8.5 introduces elsewhere — if you haven't yet seen how clone with updates readonly objects without boilerplate, the two features complement each other nicely.
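As a sketch of how several modifiers combine, the hypothetical helper below upgrades a link to https and strips the port and fragment. The name toCanonicalLink() and the idea of a "canonical link" are assumptions made for the example.

```php
<?php
use Uri\Rfc3986\Uri;

// Hypothetical helper: upgrade to https, drop any explicit port and fragment.
function toCanonicalLink(Uri $uri): Uri
{
    return $uri
        ->withScheme('https')
        ->withPort(null)      // null removes the component
        ->withFragment(null);
}

$link = new Uri('http://richdynamix.com:8080/articles?tag=php#top');

echo toCanonicalLink($link)->toString(), "\n"; // https://richdynamix.com/articles?tag=php
echo $link->toString(), "\n";                  // the original still has its port and fragment
```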
RFC 3986 vs WHATWG: Which to Use?
Reach for Uri\Rfc3986\Uri when:
- You're building backend APIs or processing URIs programmatically
- Your URIs use non-HTTP schemes
- You want strict validation — invalid input must throw, full stop
Reach for Uri\WhatWg\Url when:
- You're processing URLs submitted by users or browsers
- You need IDNA/Unicode hostname support
- You want normalisation that matches browser behaviour
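One way to encode that decision in code is to choose the parser by where the string came from. This is only a sketch: parseBySource() and its $fromBrowser flag are hypothetical, and a real application may prefer two separate code paths.

```php
<?php
use Uri\Rfc3986\Uri;
use Uri\WhatWg\Url;

// Hypothetical helper: pick the parser by the origin of the input.
// Both parse() factories return null on invalid input.
function parseBySource(string $input, bool $fromBrowser): Uri|Url|null
{
    return $fromBrowser
        ? Url::parse($input)   // lenient, browser-compatible
        : Uri::parse($input);  // strict RFC 3986
}

var_dump(parseBySource('mailto:someone@example.com', false) instanceof Uri); // bool(true)
```

The union return type makes callers acknowledge which specification they are holding before they call a class-specific method.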
The WHATWG class has its own serialisation methods to match the spec:
use Uri\WhatWg\Url;
$url = new Url('HTTPS://Example.COM/Path');
echo $url->toAsciiString(); // https://example.com/Path (lowercase, machine-readable)
echo $url->toUnicodeString(); // https://example.com/Path (display-friendly, Unicode hosts decoded)
The RFC 3986 class uses toString() / toRawString() — toAsciiString() does not exist on that class.
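The two WHATWG serialisers only differ visibly when the host contains non-ASCII characters. Here is a short sketch with a made-up Unicode hostname, assuming the source file is saved as UTF-8:

```php
<?php
use Uri\WhatWg\Url;

$url = new Url('https://münchen.example/');

echo $url->toAsciiString(), "\n";   // host in Punycode (xn--...), safe for machines
echo $url->toUnicodeString(), "\n"; // host shown in Unicode, for display
```

Store and compare the ASCII form; show the Unicode form to people.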
The pattern is similar to type safety in domain modelling: just as replacing string constants with PHP backed enums eliminates the ambiguity of raw strings, choosing the right URI class encodes the spec you're working against into your type system.
PHP 8.5 URI Extension Error Handling
parse_url() returns false on severely malformed input — but stays quiet for a lot of edge cases that return incomplete arrays:
// parse_url() says this is fine
$parts = parse_url('not://a valid uri ///');
var_dump($parts);
// array(3) { ["scheme"]=> string(3) "not" ["host"]=> string(12) "a valid uri " ["path"]=> string(3) "///" }
The PHP 8.5 URI extension throws. Use the constructor when you want an exception on bad input:
use Uri\Rfc3986\Uri;
use Uri\InvalidUriException;
try {
    $uri = new Uri($userInput);
} catch (InvalidUriException $e) {
    // $userInput was not a valid RFC 3986 URI
    return response()->json(['error' => 'Invalid URL provided'], 422);
}
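Outside a framework, the same try/catch can be wrapped in a small validation function. This is a minimal sketch; uriError() is a hypothetical name and the message format is an arbitrary choice.

```php
<?php
use Uri\InvalidUriException;
use Uri\Rfc3986\Uri;

// Hypothetical helper: returns null when the input is a valid RFC 3986 URI,
// or a short error message when the constructor rejects it.
function uriError(string $input): ?string
{
    try {
        new Uri($input);

        return null;
    } catch (InvalidUriException $e) {
        return 'Invalid URI: ' . $e->getMessage();
    }
}

var_dump(uriError('https://example.com/')); // NULL
```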
Or use the static parse() factory if you prefer a null-return style:
use Uri\Rfc3986\Uri;
$uri = Uri::parse($userInput);
if ($uri === null) {
    // invalid input — handle without a try/catch
    return null;
}
echo $uri->getHost();
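Since parse() returns null on failure, it composes with the nullsafe operator. The sketch below collapses the check into one expression; hostOf() is a hypothetical helper name.

```php
<?php
use Uri\Rfc3986\Uri;

// Hypothetical helper: host of a URI, or null if the input is invalid
// or has no host. The nullsafe operator covers the invalid case.
function hostOf(string $input): ?string
{
    return Uri::parse($input)?->getHost();
}

var_dump(hostOf('https://richdynamix.com/articles')); // string(15) "richdynamix.com"
var_dump(hostOf('not://a valid uri ///'));            // NULL
```

Note that null here means either "invalid" or "valid but hostless"; use the explicit check when the caller needs to tell those apart.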
The WHATWG version adds a third argument for collecting "soft errors" — validation warnings that do not prevent parsing:
use Uri\WhatWg\Url;
$errors = [];
$url = Url::parse(' https://example.com ', null, $errors);
// $url is a valid Url instance — WHATWG parsed it despite the leading/trailing whitespace
// $errors contains UrlValidationError objects describing the soft failures
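To see what the parser flagged, the collected errors can be listed. This sketch assumes each UrlValidationError exposes a type enum and a failure flag, as described in the extension's RFC; check the manual for the exact property names before relying on them.

```php
<?php
use Uri\WhatWg\Url;

$errors = [];
$url = Url::parse(' https://example.com ', null, $errors);

if ($url !== null) {
    echo $url->toAsciiString(), "\n"; // https://example.com/
}

// Assumed properties: $error->type (an enum case) and $error->failure (bool).
foreach ($errors ?? [] as $error) {
    printf("%s (fatal: %s)\n", $error->type->name, $error->failure ? 'yes' : 'no');
}
```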
Gotchas and Edge Cases
No fromString() method — some early blog posts mention Uri::fromString(). It does not exist. Use new Uri($string) (throws on failure) or Uri::parse($string) (returns null on failure).
Exception class — the thrown exception is Uri\InvalidUriException, not \ValueError. The exception hierarchy is Uri\UriException → Uri\InvalidUriException → Uri\WhatWg\InvalidUrlException.
RFC 3986 does not support IDNA: Uri\Rfc3986\Uri performs no Punycode conversion, and RFC 3986 itself only allows ASCII in a host, so expect a raw Unicode hostname to fail validation rather than be converted. Use Uri\WhatWg\Url for user-facing URLs with Unicode hosts.
Fragment equality — both classes ignore the fragment by default when comparing URIs. https://example.com#foo equals https://example.com. Pass UriComparisonMode::IncludeFragment to equals() for strict matching.
WHATWG requires a scheme — Uri\WhatWg\Url throws on schemeless input. Uri\Rfc3986\Uri is more permissive and accepts relative URIs.
PHP 8.5+ only — the extension is always available from PHP 8.5 with no install step. Until your app (and any packages you depend on) requires PHP 8.5 as a minimum, you can't use it in shared library code. Before upgrading, audit your PHP dependencies to check the minimum PHP versions your key packages already enforce.
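Of the gotchas above, fragment equality is the easiest to trip over and the easiest to check. This short sketch uses the two URIs from that item:

```php
<?php
use Uri\Rfc3986\Uri;
use Uri\UriComparisonMode;

$a = new Uri('https://example.com#foo');
$b = new Uri('https://example.com');

var_dump($a->equals($b));                                     // bool(true): fragment ignored
var_dump($a->equals($b, UriComparisonMode::IncludeFragment)); // bool(false): fragments differ
```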
Wrapping Up
Swap parse_url() for new Uri\Rfc3986\Uri($url) or Uri::parse($url) in any PHP 8.5 codebase. You get typed components, immutable modification, and a real exception on bad input. Use Uri\WhatWg\Url when browser compatibility matters.
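For code that must keep running on older PHP versions during the migration, one possible approach is feature detection. This is a sketch under that assumption; extractHost() is a hypothetical shim, and the two branches do not validate equally strictly.

```php
<?php
// Hypothetical migration shim: prefer the URI extension when it exists,
// fall back to parse_url() on older PHP versions.
function extractHost(string $url): ?string
{
    if (class_exists(\Uri\Rfc3986\Uri::class)) {
        // Strict: invalid input yields null.
        return \Uri\Rfc3986\Uri::parse($url)?->getHost();
    }

    // Lenient fallback: parse_url() accepts input the strict parser rejects.
    $parts = parse_url($url);

    return is_array($parts) ? ($parts['host'] ?? null) : null;
}

var_dump(extractHost('https://richdynamix.com/articles?tag=php')); // string(15) "richdynamix.com"
```

Keeping the call behind one function means the fallback branch can be deleted in a single place once PHP 8.5 is the minimum.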
PHP 8.5 is shipping several ergonomic improvements at once — the |> pipe operator for cleaner function chaining is another worth picking up in the same upgrade window.
Steven is a software engineer with a passion for building scalable web applications. He enjoys sharing his knowledge through articles and tutorials.