@jhorsman
Last active March 29, 2024 05:25
Semantic versioning regex
@Tschebbischeff

@rverst
I don't know if you did this on all the links, but at least for mine you are missing the leading ^ and trailing $, which are needed to make the pattern match the whole line (hence it doesn't fully match the patch version in the link you provided).

On your test set, using my actual RegEx, I cannot find any missing matches on valid strings.
On invalid strings I find significantly fewer errors than you did with my actual RegEx.
The only error is that I allow leading zeroes in the pre-release versions, which is indeed illegal and is handled correctly by the official regex that now exists.

Here is a link to my test with the ^ and $ : https://regex101.com/r/JOKR70/1
(I have added 1.2.3-0123+meta and 1.2.3-123.01234 as invalid test cases)

This is the test with the official RegEx: https://regex101.com/r/TefKLN/1

Semantically, I cannot find any other mistakes in my RegEx when comparing it to the official one.

I will also update my answer above with the suggestion to use the official RegEx, since people finding this thread may not read through a long discussion.

@rverst

rverst commented Sep 23, 2019

@Tschebbischeff I did, for the tester at http://regexstorm.net, since it did not seem to work at all (at least for me) with the start- or end-of-line tokens.
You're right, your regex matches the valid strings. Not on regexstorm.net, but I wouldn't recommend that page for validating regexes anyway.

I'm sorry if I misinterpreted some results of my (admittedly fast and simple) test, but I just couldn't leave @johnwx's statement:

"All of those regex work and have been tested to work with examples"

like that.

My main goal was to make people who find this thread aware of the official version of the regex, and of the fact that five or so test cases are sometimes just not enough.

@jmatsushita

Here's a version including semver constraints: https://regex101.com/r/Ly7O1x/196

@Dentrax

Dentrax commented May 7, 2020

@jmatsushita @jhorsman

1.0.0-alpha- passes the unit test, but it should not.

@Tschebbischeff

Tschebbischeff commented May 19, 2020

@Dentrax
1.0.0-alpha- is in fact a valid semantic version according to item 9 of the SemVer specification:

A pre-release version MAY be denoted by appending a hyphen and a series of dot separated identifiers immediately following the patch version. Identifiers MUST comprise only ASCII alphanumerics and hyphen [0-9A-Za-z-]. Identifiers MUST NOT be empty. Numeric identifiers MUST NOT include leading zeroes. Pre-release versions have a lower precedence than the associated normal version. A pre-release version indicates that the version is unstable and might not satisfy the intended compatibility requirements as denoted by its associated normal version. Examples: 1.0.0-alpha, 1.0.0-alpha.1, 1.0.0-0.3.7, 1.0.0-x.7.z.92.

Hyphens are allowed to be a part of the pre-release identifiers.
Only the first hyphen in a semantic version string denotes the beginning of the pre-release identifiers that follow.
The pre-release information is terminated either by the end of the string or a + (which denotes the beginning of build metadata).

I.e. your example resolves to these values:

major: 1
minor: 0
patch: 0
pre-release: ["alpha-"]
build-metadata: []
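
The breakdown above can be checked with a short Python sketch. It assumes the named-group variant of the pattern from the semver.org FAQ; the `parse` helper and the dictionary keys are made up for this illustration and are not part of any library.

```python
import re

# Named-group variant of the pattern recommended in the semver.org FAQ.
SEMVER = re.compile(
    r"^(?P<major>0|[1-9]\d*)\.(?P<minor>0|[1-9]\d*)\.(?P<patch>0|[1-9]\d*)"
    r"(?:-(?P<prerelease>(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+(?P<buildmetadata>[0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$"
)


def parse(version: str) -> dict:
    """Split a semantic version string into its parts (hypothetical helper)."""
    match = SEMVER.match(version)
    if match is None:
        raise ValueError(f"not a semantic version: {version!r}")
    pre = match["prerelease"]
    build = match["buildmetadata"]
    return {
        "major": int(match["major"]),
        "minor": int(match["minor"]),
        "patch": int(match["patch"]),
        # Identifiers are dot separated; a missing part becomes an empty list.
        "pre-release": pre.split(".") if pre else [],
        "build-metadata": build.split(".") if build else [],
    }


if __name__ == "__main__":
    print(parse("1.0.0-alpha-"))
```

The trailing hyphen stays inside the single pre-release identifier because the hyphen is part of the allowed identifier alphabet; only the first hyphen after the patch version acts as a separator.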

(Also there is now an officially recommended RegEx in the FAQ section on semver.org that you can use :) )

@mathomp4

mathomp4 commented Feb 2, 2022

Here's a fun question for the gurus here: Does anyone have a good GitHub-acceptable SemVer regex?

I got into the beta for protected tags and tried the official SemVer regexes, but both seem to be too complex for GitHub's regexer.

@Tschebbischeff

Tschebbischeff commented Feb 2, 2022

@mathomp4
It is impossible to create a pattern for semver strings with the glob-style pattern matching emmaviolet@github mentioned they use.
I'm sorry 😔

More detail if you want:

A semver string's major version part alone allows for a theoretically infinite string of numerical characters. Let's try to get the equivalent of the regex [0-9]* only.

* matches any string (possibly excluding /, in which case ** specifically would also allow / inside).

It can be constrained with a constant prefix, infix or suffix, but there is no way to limit the "type of character" it allows at the beginning and/or end of the string.

Now * and ** are the only two special characters that can match more than one character.
There is no way to make ? or [set] match more than one character, and {a,b} (if it is even available here) is based on a and b being patterns. Those patterns might allow infinite strings by using * inside, but they cannot limit the type of character, for the same reason as above.

Hence, it's not possible to match "an infinite string comprised of only specific characters" in glob-style 🥲
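
A quick way to see this is with Python's `fnmatch` module, used here only as a stand-in for glob-style matching. It is an assumption that it behaves like GitHub's matcher in the relevant respects, and the pattern and the name `looks_like_semver` are made up for this illustration. The closest glob attempt pins the first character of each part to a digit, but * then accepts anything after it.

```python
from fnmatch import fnmatchcase

# Closest glob-style attempt at "digits.digits.digits": each part must start
# with a digit, but * places no restriction on what follows that digit.
GLOB_ATTEMPT = "[0-9]*.[0-9]*.[0-9]*"


def looks_like_semver(tag: str) -> bool:
    """Glob-based check (hypothetical helper); intentionally too permissive."""
    return fnmatchcase(tag, GLOB_ATTEMPT)


if __name__ == "__main__":
    for tag in ["1.2.3", "10.20.30", "1x.2y.3z", "1.2.3.4.5", "a.b.c"]:
        print(tag, looks_like_semver(tag))
```

Valid versions pass, but so do strings such as `1x.2y.3z`, because nothing in glob syntax can say "zero or more of this character class".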

@mathomp4

mathomp4 commented Feb 2, 2022

@Tschebbischeff Yeah, I stared at it for a while but I figured "If I can't do an ls for that pattern, I can't do this". Oh well, at least I can block off v*...and hope no bosses want a v-tag! 😄
