the blog for developers

Fluent Interface for Regular Expressions

Some years ago when we migrated Radeox – a wiki engine – to a new Regular Expression engine, we had to change lots of Regular Expressions. Today I would do it in a different way, with Fluent Interfaces. Joshua has a solution for building Regular expressions with a Fluent Interface, about which I wrote before.

Regex socialSecurityNumberCheck =
  new Regex(Pattern.With.AtBeginning
    .Digit.Repeat.Exactly(3)
    .Literal("-").Repeat.Optional
    .Digit.Repeat.Exactly(2)
    .Literal("-").Repeat.Optional
    .Digit.Repeat.Exactly(4)
    .AtEnd);

Not only is this solution easier to understand than using a String for regular expressions, migrating from ORO Regex to JDK Regex for example would be easy with this solution. The solution we chose was to create two engine implementations for ORO and JDK regex: Compiler (JdkCompiler,OroCompiler), MatchResult (JdkMatchResult,OroMatchResult), Matcher (JdkMatcher,OroMatcher) and so on. For example:

public class OroCompiler extends Compiler {
  private Perl5Compiler internalCompiler;

  private boolean multiline;

  public OroCompiler() {
    this.internalCompiler = new Perl5Compiler();
  }

  public void setMultiline(boolean multiline) {
    this.multiline = multiline;
  }

  public Pattern compile(String regex) {
    org.apache.oro.text.regex.Pattern pattern = null;
    try {
      pattern = internalCompiler.compile(regex,
         (multiline ? Perl5Compiler.MULTILINE_MASK : Perl5Compiler.SINGLELINE_MASK) | Perl5Compiler.READ_ONLY_MASK);
    } catch (MalformedPatternException e) {
      // handle later
    }

    return new OroPattern(regex, multiline, pattern);
  }
}

The actual regular expressions are stored in Java .properties files, to make them easily exchangable. The bold expression looks like this for example:

filter.bold.match=(^|>|[\\p{Punct}\\p{Space}]+)__(.*?)__([\\p{Punct}\\p{Space}]+|<|$)

With the migration of Radeox to a new regex implementation I had to write and implement several interfaces and it wasn't very easy to both support ORO and the JDK implementation. It wan't easy either to change the slightly different pattern dialact from ORO to JDK. With a Fluent Interface, just change the Fluent Interface, no need to think about different Java Regex APIs and to think about different syntaxes.

Thanks for listening.

You can leave a Reply here. Of course, you should follow me on twitter here.

You can share this post!
Do you want to tell others about this article? Use the social bookmark icons to submit this artice to the service of your choice. Thanks.

About the author: Stephan Schmidt has more than 15 years of internet technology experience and 10 years experience in agile. He was head of development, consultant and CTO and is a speaker, author and blog writer. He specializes in organizing and optimizing software development helping companies by increasing productivity with lean software development and agile methodologies. Want to know more? All views are only his own.
Leave a reply.

Leave a Reply

What people wrote somewhere else:

Additional comments powered by BackType

Guide to CodeMonkeyism

Over the last 4 years I wrote many articles on this blog. To make it easier for you to find the relevant ones, I've organized them into topics.

Top 10

6 reasons why my VC funded startup did fail

Go Ahead: Next Generation Java Programming Style

Java Interview questions: Write a String Reverser

The dark side of NoSQL

7 Bad Signs not to Work for a Software Company or Startup

Is Java dead?

Scala vs. Clojure

Never, never, never use String in Java

No future for functional programming in 2008 – Scala, F# and Nu

Clojure vs Scala, Part 2

Java Developer

Is Java Dead?

Go Ahead: Next Generation Java Programming Style

Be careful with magical code

All variables in Java must be final

Never, never, never use String in Java

Bending Java: More readable code with methods that do nothing?

NoSQL Guy

NoSQL: The Dawn of Polyglot Persistence

The dark side of NoSQL

Essential storage tradeoff: Simple Reads vs. Simple Writes

Sharding destroys the goals of your relational database

The unholy legacy of databases

Startup/CTO

Development Dream Teams

6 reasons why my VC funded startup did fail

American vs. European style of Software Development

12 Things to Reduce Your Lead Time and Time to Market

The high cost of overhead when working in parallel

Essential storage tradeoff: Simple Reads vs. Simple Writes

Job Seeker

Another Good (Java) Interview Question

7 Bad Signs not to Work for a Software Company or Startup

Java Interview questions: Write a String Reverser (and use Recursion!)

Java Interview questions: Multiple Inheritance

As a Manager: What I value in developers

Top 10 Tips (+1) to Get a Pay Raise

Agilist

What Developers Need to Know About Agile

5 Practices Better to Change in Your Scrum Implementation

Scrum is not about engineering practices

ScrumMaster and ZenMaster: The joke of certification

What is Trans-Scrum?