Splitting CamelCase With Regular Expressions

On my new project, I needed to split a method name up into its constituent parts.

For Ruby programmers, this is easy: just split on underscores. We .NET programmers need something a little fancier, since we like to SquashOurMethodNamesTogetherLikeThis.

Here’s a little regular expression that can do exactly that:

(?<!^)(?=[A-Z])

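That says to match the empty position right before every capital letter, unless the capital letter is at the beginning of the string. The lookahead (?=[A-Z]) finds the capital letter without consuming it, and the negative lookbehind (?<!^) rules out the very start of the string, so no characters are lost in the split.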
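To see where those zero-width matches land, here is a minimal sketch that lists the match positions. The class and method names (BoundaryDemo, BoundaryIndexes) are made up for illustration and are not part of the original project.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoundaryDemo
{
    // Returns the index of every zero-width match of the pattern in the input.
    // Each match has length 0: it marks a position, not a character.
    public static int[] BoundaryIndexes(string input)
    {
        MatchCollection matches = Regex.Matches(input, @"(?<!^)(?=[A-Z])");
        int[] indexes = new int[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            indexes[i] = matches[i].Index;
        }
        return indexes;
    }
}
```

For the input "SplitThisName" this returns the indexes 5 and 9, the positions just before the "T" and the "N". The leading "S" at index 0 is skipped because of the negative lookbehind.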

You can use the Regex.Split method to do the actual work of turning the method name into a string array:

using System.Text.RegularExpressions;
// methodName holds the name to split, e.g. "GetUserName"
Regex splitter = new Regex(@"(?<!^)(?=[A-Z])");
string[] words = splitter.Split(methodName);
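For anyone who wants something to paste into a project, the snippet can be wrapped up roughly like this. It is only a sketch: the MethodNameSplitter class and its SplitWords method are hypothetical names, and caching the Regex in a static field is just one reasonable choice.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MethodNameSplitter
{
    // Built once and reused, since the pattern never changes.
    private static readonly Regex Splitter = new Regex(@"(?<!^)(?=[A-Z])");

    // Splits a squashed-together name before each capital letter
    // except a capital at the very start of the string.
    public static string[] SplitWords(string methodName)
    {
        return Splitter.Split(methodName);
    }
}
```

Calling MethodNameSplitter.SplitWords("SquashOurMethodNamesTogetherLikeThis") gives the seven words Squash, Our, Method, Names, Together, Like and This. A name starting with a lowercase letter works as well: "camelCase" comes back as camel and Case.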

My friend Michael Kennedy recently demoed in class how to use LINQPad to test regular expressions. It came in really handy while working on this:

[Image: linqpad-with-regex]

I really need to click that “Activate Autocompletion” button so I can give Joseph Albahari the money he deserves for creating such a useful tool.

By the way, I used to think that “camel case” was for words like “camelCase” and “pascal case” was for words like “PascalCase”, but Wikipedia doesn’t make that distinction.

Comments (3)

  1. James Smith wrote:

    Your thinking about camelCasing and PascalCasing is correct. If Wikipedia (the mother of non-authoritative sources) contradicts that, then, QED, it’s incorrect. See Microsoft for official definitions (camelCase/PascalCasing).

    Please do not corrupt the world by waffling. camelCase is camelCase and PascalCase is PascalCase.

    Wednesday, November 4, 2009 at 7:48 am
  2. jmini wrote:

    I just tried your regex… it works perfectly in Java…

    String s = "loremIpsum";
    words = s.split("(?<!^)(?=[A-Z])");

    [...]

    1) VALUE -> V / A / L / U / E
    2) eclipseRCPExt -> eclipse / R / C / P / Ext

    To my mind, the result should be:
    1) VALUE
    2) eclipse / RCP / Ext

    In other words, given n uppercase chars, if they are followed by lowercase, it should be (n-1 chars) / (n-th char + lower chars).

    If they are not followed by lowercase (that is, they are at the end): (n chars).

    If you have any idea with a regex…

    Thursday, September 29, 2011 at 12:09 am
  3. jmini wrote:

    I got an answer on Stack Overflow:

    http://stackoverflow.com/questions/7593969/regex-to-split-camelcase-or-titlecase-advanced/7594052#7594052

    There is a possible extension to the regex.

    Thursday, September 29, 2011 at 12:55 am
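For reference, the behavior jmini asks for (keeping a run of capitals together, with the last capital of the run going to the following word when lowercase letters come next) can be sketched in C# as below. This is one possible pattern worked out from the rules in the comment, not necessarily the one given in the linked answer, and the AcronymAwareSplitter name is made up for illustration. It only considers ASCII letters and does nothing special for digits.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AcronymAwareSplitter
{
    // Two zero-width alternatives:
    //   (?<=[a-z])(?=[A-Z])       between a lowercase letter and a capital
    //   (?<=[A-Z])(?=[A-Z][a-z])  between two capitals when the second one
    //                             starts a word (it is followed by lowercase)
    private static readonly Regex Splitter =
        new Regex(@"(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])");

    public static string[] SplitWords(string name)
    {
        return Splitter.Split(name);
    }
}
```

With this sketch, "VALUE" stays in one piece and "eclipseRCPExt" splits into eclipse, RCP and Ext, while "loremIpsum" still splits into lorem and Ipsum.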
