[D] Can ML be used to detect repeated patterns in text?

I am trying to analyze system logs, which are messy, unstructured text, and I would like to detect repeated patterns in them.

As an example:

Feb 24 06:48:03 circle vpopmail[12039]: vchkpw-pop3: **password fail [EMAIL PROTECTED]:**67.109.191.46
Feb 24 06:49:03 circle vpopmail[12043]: vchkpw-pop3: **password fail [EMAIL PROTECTED]:**67.109.191.46
Feb 24 06:50:03 circle vpopmail[12099]: vchkpw-pop3: **password fail [EMAIL PROTECTED]:**67.109.191.46
Feb 24 08:13:31 circle vpopmail[13042]: vchkpw-pop3: **password fail [EMAIL PROTECTED]:**70.104.21.208
Feb 24 08:13:32 circle vpopmail[13046]: vchkpw-pop3: **password fail [EMAIL PROTECTED]:**70.104.21.208

The pattern could be:

pattern = “.password fail [EMAIL PROTECTED]:($ip)”

I didn’t know the pattern in advance; I only discovered it by eyeballing the text. You might say I could tokenize the words and count frequencies, but in my case it’s hard to tokenize and to decide on substring window sizes, because the patterns might vary.
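
For reference, the tokenize-and-count baseline mentioned above might look like the following minimal sketch. It is not an ML method: it assumes the variable fields (timestamp, process ID, IP address) can be masked with hand-written regular expressions, which is exactly the part that becomes hard when the format is unknown. The names `MASKS`, `to_template`, and `count_templates` are hypothetical, and the sample lines are the ones from the example with the bold markers removed.

```python
import re
from collections import Counter

# Hypothetical masking rules (an assumption, not derived from the data):
# each regex replaces one kind of variable field with a fixed placeholder.
MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),                      # IPv4 address
    (re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}"), "<TIME>"),  # syslog timestamp
    (re.compile(r"\[\d+\]"), "[<PID>]"),                                       # process ID
]

def to_template(line):
    """Replace variable fields in one log line with placeholders."""
    for pattern, placeholder in MASKS:
        line = pattern.sub(placeholder, line)
    return line

def count_templates(lines):
    """Count how often each masked template occurs, skipping blank lines."""
    return Counter(to_template(line.strip()) for line in lines if line.strip())

sample = [
    "Feb 24 06:48:03 circle vpopmail[12039]: vchkpw-pop3: password fail [EMAIL PROTECTED]:67.109.191.46",
    "Feb 24 06:49:03 circle vpopmail[12043]: vchkpw-pop3: password fail [EMAIL PROTECTED]:67.109.191.46",
    "Feb 24 06:50:03 circle vpopmail[12099]: vchkpw-pop3: password fail [EMAIL PROTECTED]:67.109.191.46",
    "Feb 24 08:13:31 circle vpopmail[13042]: vchkpw-pop3: password fail [EMAIL PROTECTED]:70.104.21.208",
    "Feb 24 08:13:32 circle vpopmail[13046]: vchkpw-pop3: password fail [EMAIL PROTECTED]:70.104.21.208",
]

if __name__ == "__main__":
    # Print templates from most to least frequent.
    for template, count in count_templates(sample).most_common():
        print(count, template)
```

With these masks, all five sample lines collapse into a single template, so the repeated pattern surfaces as the most frequent entry. The limitation is that every mask has to be written by hand, so this only serves as a baseline to compare against when the variable fields are not known in advance.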

Is there a name for such techniques? I couldn’t find any ML techniques applied to this problem.

submitted by /u/__Julia