URI Swiss Army Knife Utilities

Install the published npm package:

npm i beam-uri

URL Validation

A complete definition of what constitutes a valid URL can be found in RFC 3986 (URIs) and RFC 3987 (IRIs). The short version is that a valid URL must, at minimum, consist of a scheme (https://, ftp://, gopher://) and a host name. If it lacks either, validation should fail, and the browser should throw an error.
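
The minimum rule above (a scheme plus a host name) can be sketched with the WHATWG `URL` class that ships with Node.js and browsers. This is an illustrative check only: `hasSchemeAndHost` is a hypothetical name, not a function exported by this package, and it is far more permissive than the regex-based validation referenced later in this README.

```javascript
// Hypothetical helper, not part of this package: checks only for a scheme and a host.
function hasSchemeAndHost(input) {
  let parsed;
  try {
    parsed = new URL(input); // throws a TypeError when the input cannot be parsed
  } catch {
    return false;
  }
  // protocol includes the trailing colon, e.g. "https:"
  return parsed.protocol.length > 1 && parsed.hostname.length > 0;
}

console.log(hasSchemeAndHost("https://example.com/path"));   // true
console.log(hasSchemeAndHost("example.com/path"));           // false (no scheme)
console.log(hasSchemeAndHost("mailto:someone@example.com")); // false (no host)
```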

A URL string is a structured string containing multiple meaningful components. When parsed, a URL object is returned containing properties for each of these components.

The Node.js url module provides two APIs for working with URLs: a legacy API that is Node.js specific, and a newer API that implements the same WHATWG URL Standard used by web browsers.

┌────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                              href                                              │
├──────────┬──┬─────────────────────┬────────────────────────┬───────────────────────────┬───────┤
│ protocol │  │        auth         │          host          │           path            │ hash  │
│          │  │                     ├─────────────────┬──────┼──────────┬────────────────┤       │
│          │  │                     │    hostname     │ port │ pathname │     search     │       │
│          │  │                     │                 │      │          ├─┬──────────────┤       │
│          │  │                     │                 │      │          │ │    query     │       │
"  https:   //    user   :   pass   @ sub.example.com : 8080   /p/a/t/h  ?  query=string   #hash "
│          │  │          │          │    hostname     │ port │          │                │       │
│          │  │          │          ├─────────────────┴──────┤          │                │       │
│ protocol │  │ username │ password │          host          │          │                │       │
├──────────┴──┼──────────┴──────────┼────────────────────────┤          │                │       │
│   origin    │                     │         origin         │ pathname │     search     │ hash  │
├─────────────┴─────────────────────┴────────────────────────┴──────────┴────────────────┴───────┤
│                                              href                                              │
└────────────────────────────────────────────────────────────────────────────────────────────────┘
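(All spaces in the "" line should be ignored; they are purely for formatting. Properties shown above the URL string are those returned by the legacy url.parse(); properties shown below it belong to a WHATWG URL object.)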
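
To make the diagram concrete, the snippet below parses the same example URL with the WHATWG `URL` class and collects the components shown in the lower half of the diagram. It uses only the built-in `URL` global; the `parts` object is just an illustrative container.

```javascript
// Parse the URL from the diagram (with the formatting spaces removed).
const u = new URL("https://user:pass@sub.example.com:8080/p/a/t/h?query=string#hash");

const parts = {
  protocol: u.protocol, // "https:"
  username: u.username, // "user"
  password: u.password, // "pass"
  host: u.host,         // "sub.example.com:8080"
  hostname: u.hostname, // "sub.example.com"
  port: u.port,         // "8080"
  pathname: u.pathname, // "/p/a/t/h"
  search: u.search,     // "?query=string"
  hash: u.hash,         // "#hash"
  origin: u.origin,     // "https://sub.example.com:8080"
};

console.log(parts);
```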

getDomain(url) ⇒ `String`

We can extract the domain from a URL by building on our method for parsing the hostname. Since the getHostName() method gets us very close to a solution, we only need to remove the sub-domain and handle special cases (such as .co.uk).

Returns: String - the extracted domain
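
The idea of dropping the sub-domain while handling suffixes such as .co.uk can be sketched as follows. This is not the package's implementation: `domainSketch` and `MULTI_PART_SUFFIXES` are hypothetical names, the suffix list is a tiny illustrative sample, and the sketch throws on input that `URL` cannot parse. A production version would more likely consult the Public Suffix List.

```javascript
// Tiny illustrative sample of two-label suffixes (an assumption, not the package's list).
const MULTI_PART_SUFFIXES = new Set(["co.uk", "org.uk", "com.au", "co.jp"]);

// Hypothetical helper: returns the registrable domain of an absolute URL.
function domainSketch(url) {
  const labels = new URL(url).hostname.toLowerCase().split(".");
  if (labels.length <= 2) return labels.join(".");
  // Keep three labels when the last two form a known multi-part suffix, otherwise two.
  const lastTwo = labels.slice(-2).join(".");
  const keep = MULTI_PART_SUFFIXES.has(lastTwo) ? 3 : 2;
  return labels.slice(-keep).join(".");
}

console.log(domainSketch("https://sub.example.com/page")); // "example.com"
console.log(domainSketch("https://news.bbc.co.uk/sport")); // "bbc.co.uk"
```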

getDomainName(url) ⇒ `String`

Extracts the main domain name without its top-level domain suffix (for example, example rather than example.com).

Returns: String - the extracted domain

getHostName(url) ⇒ `String`

Extracting the hostname from a URL is generally easier than parsing the domain. The hostname of a URL consists of the entire domain plus any sub-domain. We can parse it with a regular expression that looks for the text to the right of the double slash in a URL. We also remove a leading “www” (and any digits attached to it, e.g. www2), as this is typically not needed when parsing the hostname from a URL.

Returns: String - the extracted hostname
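
A regular-expression approach along these lines might look like the sketch below. The pattern and the name `hostNameSketch` are assumptions made for illustration, not the expression used by this package. The sketch returns an empty string when nothing matches and does not handle bracketed IPv6 hosts.

```javascript
// Hypothetical helper: captures the host that follows "scheme://" (and optional "user:pass@").
function hostNameSketch(url) {
  const match = /^[a-z][a-z0-9+.-]*:\/\/(?:[^/?#@]*@)?([^/?#:]+)/i.exec(url);
  if (!match) return "";
  // Drop a leading "www", "www2", "www3", ... label.
  return match[1].toLowerCase().replace(/^www\d*\./, "");
}

console.log(hostNameSketch("https://www2.blog.example.com:8080/a?b=c")); // "blog.example.com"
console.log(hostNameSketch("http://user:pass@www.example.org/x"));       // "example.org"
```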

getLinkType(source) ⇒ `String`

Identifies whether the link points to a social website.

Kind: global function
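
One way to classify a link is to compare its hostname against a table of known social sites. The sketch below is purely illustrative: the host table, the returned labels ("other", "unknown"), and the name `linkTypeSketch` are all assumptions, since this README does not document the values that getLinkType() returns.

```javascript
// Illustrative table only; the hosts and labels are assumptions.
const SOCIAL_HOSTS = new Map([
  ["twitter.com", "twitter"],
  ["facebook.com", "facebook"],
  ["linkedin.com", "linkedin"],
]);

// Hypothetical helper: maps a URL to a social-site label, "other", or "unknown".
function linkTypeSketch(source) {
  let hostname;
  try {
    hostname = new URL(source).hostname.toLowerCase();
  } catch {
    return "unknown"; // the input could not be parsed as an absolute URL
  }
  for (const [host, label] of SOCIAL_HOSTS) {
    // Match the site itself and any of its sub-domains (e.g. "m.facebook.com").
    if (hostname === host || hostname.endsWith("." + host)) return label;
  }
  return "other";
}

console.log(linkTypeSketch("https://m.facebook.com/some.page")); // "facebook"
console.log(linkTypeSketch("https://example.com/"));             // "other"
```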

isValidIP(ip) ⇒ `Boolean`

Validates whether the passed string is a valid IP address according to: http://jsfiddle.net/AJEzQ/

Returns: Boolean - whether the string is a valid IP address
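
For illustration, a dotted-quad IPv4 check can be written without a large regular expression. This sketch is not the logic from the linked fiddle: `isIPv4Sketch` is a hypothetical name, it covers IPv4 only, and rejecting leading zeros is a design choice made here rather than something this README specifies.

```javascript
// Hypothetical helper: true for strings such as "192.168.0.1".
function isIPv4Sketch(ip) {
  const parts = String(ip).split(".");
  if (parts.length !== 4) return false;
  return parts.every((part) => {
    if (!/^\d{1,3}$/.test(part)) return false;             // one to three digits only
    if (part.length > 1 && part[0] === "0") return false;  // reject leading zeros
    return Number(part) <= 255;                            // each octet must fit in a byte
  });
}

console.log(isIPv4Sketch("192.168.0.1")); // true
console.log(isIPv4Sketch("256.1.1.1"));   // false
```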

isValidURI(url) ⇒ `Boolean`

Validates whether the passed string is a valid URI according to: https://gist.github.com/dperini/729294

Returns: Boolean - whether the string is a valid URI

normalize(url) ⇒ `String`

Normalizes and canonicalises URLs, including data URLs. The function first normalizes the URL by performing various steps, from lower-casing to encoding. It then strips any URL trackers and padding. Finally, it tries to canonicalise the URL where possible, based on configurations that depend on the domain name.

Returns: String - the normalized and canonical url
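
The first stage (lower-casing and encoding) can be approximated with the WHATWG `URL` class, which lower-cases the scheme and host, drops default ports, and percent-encodes the path while parsing. The sketch below is an assumption about what such a step could look like, not this package's implementation: `normalizeSketch` is a hypothetical name, sorting the query parameters is an extra choice made here, and the tracker-stripping and per-domain canonicalisation stages are not covered.

```javascript
// Hypothetical helper covering only the basic normalization stage.
function normalizeSketch(input) {
  const u = new URL(input.trim()); // parsing lower-cases scheme/host and encodes the path
  u.searchParams.sort();           // assumption: stable ordering of query parameters
  return u.href;
}

console.log(normalizeSketch("HTTP://Example.COM:80/a b?z=1&a=2"));
// "http://example.com/a%20b?a=2&z=1"
```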

removeURLTracking(url) ⇒ `String`

Removes tracking query parameters from the URL.

Returns: String - the URL after the tracking parameters have been stripped
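
As a rough sketch of the technique, tracking parameters can be removed through `URLSearchParams`. The parameter names below (the `utm_` prefix, `fbclid`, `gclid`) are common examples chosen for illustration; the list this package actually strips is not documented in this README, and `stripTrackingSketch` is a hypothetical name.

```javascript
// Illustrative lists only; the package's real tracker list may differ.
const TRACKING_PREFIXES = ["utm_"];
const TRACKING_NAMES = new Set(["fbclid", "gclid"]);

// Hypothetical helper: returns the URL without known tracking parameters.
function stripTrackingSketch(input) {
  const u = new URL(input);
  // Copy the keys first so that deleting entries does not disturb the iteration.
  for (const key of [...u.searchParams.keys()]) {
    const lower = key.toLowerCase();
    if (TRACKING_NAMES.has(lower) || TRACKING_PREFIXES.some((p) => lower.startsWith(p))) {
      u.searchParams.delete(key);
    }
  }
  return u.href;
}

console.log(stripTrackingSketch("https://example.com/post?id=7&utm_source=news&fbclid=abc"));
// "https://example.com/post?id=7"
```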

parse(url) ⇒ `Object`

Parses a valid URI into its subparts

Returns: Object - the parsed url

References

In search of the perfect URL validation regex
uri-js: An RFC 3986 compliant, scheme extendable URI parsing/validating/normalizing/resolving library for JavaScript
regex-weburl: Regular Expression for URL validation
parse-domain: Splits a URL into sub-domain, domain and the top-level domain. Provides TypeScript typings
normalize-url: Normalize a URL
