Use regex to find the source of an image

Recently, I needed to find the source of an image in a blob of text. Since the text was out of my control, I couldn't know in advance where the image tag would be. So I buckled down and finally learned some regex, which had been my nemesis for years. Below is the result.

/src=[\"](.*?)[\"]/i

Looking at it now, I think it could be improved upon to be

/\<img.*?src\=[\"|\']?(.*?)[\"|\']?/i

[Edit: I’m dumb. The first one works. I can’t figure out how to make the quotes optional. They really shouldn’t be optional anyway, because your markup should be well-formed.]

Basically, my original expression looks for the first src attribute it finds and returns the text between its quotes. Looking back on it, this makes a few assumptions.

1) the first src attribute is within an image tag

2) the attribute value is surrounded by double quotes
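To make those two assumptions concrete, here is a minimal sketch of the original expression in Python. The language is an assumption (the post doesn't name one), the helper name `first_src` and the sample strings are made up for illustration, and `re.IGNORECASE` stands in for the `/i` flag.

```python
import re

# The original pattern, ported to Python. re.IGNORECASE replaces the /i flag.
SRC_PATTERN = re.compile(r'src=["](.*?)["]', re.IGNORECASE)

def first_src(text):
    """Return the value of the first double-quoted src attribute, or None."""
    match = SRC_PATTERN.search(text)
    return match.group(1) if match else None

# Made-up sample text for illustration.
blob = 'Some text <p>hello</p> <IMG alt="cat" SRC="/images/cat.png"> more text'
print(first_src(blob))  # /images/cat.png

# Assumption 1 broken: another tag's src comes before the image.
print(first_src('<script src="app.js"></script><img src="cat.png">'))  # app.js

# Assumption 2 broken: single quotes are not matched at all.
print(first_src("<img src='cat.png'>"))  # None
```

As long as both assumptions hold, the lazy `(.*?)` stops at the first closing double quote, which is why the simple version does the job.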

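The second one (redone just now and therefore untested) should be better, namely because it doesn’t make those assumptions: it looks for the src inside an img tag and allows double quotes, single quotes, or no quotes at all.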
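For anyone who does need optional quotes, here is a hedged sketch in Python (again an assumed language). The catch with the second expression is that a lazy `(.*?)` followed only by an optional quote is never forced to grow, so it is satisfied by an empty string. The `|` inside the square brackets is also a literal pipe character, not an "or". One way around this is to remember the opening quote and require the same one to close, with a separate branch for unquoted values. The names `IMG_SRC` and `img_src` are hypothetical, and this is not a substitute for a real HTML parser.

```python
import re

# The second pattern from the post, lightly simplified (needless escapes dropped).
SECOND_ATTEMPT = re.compile(r'''<img.*?src=["|']?(.*?)["|']?''', re.IGNORECASE)

# It matches, but the lazy group captures nothing.
print(repr(SECOND_ATTEMPT.search('<img src="cat.png">').group(1)))  # ''

# Sketch of a fix: either a quoted value closed by the same quote (\1),
# or a run of characters that cannot appear in an unquoted value.
IMG_SRC = re.compile(
    r'''<img\b[^>]*?\ssrc\s*=\s*(?:(["'])(.*?)\1|([^\s"'>]+))''',
    re.IGNORECASE,
)

def img_src(text):
    """Return the src of the first img tag, quoted or not, or None."""
    match = IMG_SRC.search(text)
    if match is None:
        return None
    # Group 1 is the opening quote, group 2 the quoted value,
    # group 3 the unquoted value.
    return match.group(2) if match.group(1) else match.group(3)

print(img_src('<img src="cat.png">'))  # cat.png
print(img_src("<img src='cat.png'>"))  # cat.png
print(img_src('<img src=cat.png>'))    # cat.png
print(img_src('<script src="app.js"></script><img alt="c" src="cat.png">'))  # cat.png
```

Anchoring on `<img` and refusing to cross a `>` keeps the match inside the image tag, which takes care of the first assumption as well.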

The point is that I’m now learning regex and I’m excited. Plus, I wanted to put this somewhere on the web in case there’s someone like me who needs it and doesn’t understand regex quite yet. Maybe one day I’ll come back and break down how and why it does what it does, but as of now, I don’t think I can put my understanding of regex into a form that someone who isn’t me would understand.
