Use cURL from your Pwnbox (not the target machine) to obtain the source code of the "https://www.inlanefreight.com" website and filter all unique paths of that domain. Submit the number of these paths as the answer.

I use this :moyai::
curl "https://inlanefreight.com" > index.com
sed "s/\(\"\|'\)\(https:\/\/www\.inlanefreight\.com\S*\)\1/\1\n\2\n\1/g" index.html | grep "https://www.inlane" | sort -u | wc -l

hail regex
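
In case the sed looks like line noise: it matches a quoted absolute URL and re-emits it with newlines around it, so the bare URL lands on its own line for grep to pick up. A quick demo on a made-up input line (\S and \| are GNU sed extensions, so this assumes GNU sed):

echo '<link href="https://www.inlanefreight.com/feed/">' | sed "s/\(\"\|'\)\(https:\/\/www\.inlanefreight\.com\S*\)\1/\1\n\2\n\1/g"

This prints the URL on its own line between the two quote characters.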

This was a difficult one to figure out. I couldn’t do it by myself.

The command that finally worked for me:

Still trying to figure out how this command worked…

Can anyone help explain this? Thanks

I solved it this way: curl -s "https://www.inlanefreight.com" | grep 'https://www.inlanefreight.com' | cut -d'.' -f3,4 | cut -d'<' -f1 | grep -v 'org/' | sort | uniq | wc -l
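
In case the double cut is confusing: splitting on '.' makes the end of the domain the start of field 3, so -f3,4 keeps everything from "com" onward (illustrative input, not a real line from the page):

echo 'href="https://www.inlanefreight.com/index.php/news/">' | cut -d'.' -f3,4
# prints: com/index.php/news/">

One caveat: paths with extra dots (e.g. style.min.css) spill into fields 5 and beyond, so -f3,4 truncates them.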

I am not very clear about the definition of a path. Are the following two extracted URLs the same path?
I think they are the same as
"https://www.inlanefreight.com/index.php/wp-json/oembed/1.0/embed"

https://www.inlanefreight.com/index.php/wp-json/oembed/1.0/embed?url=https%3A%2F%2Fwww.inlanefreight.com%2F
https://www.inlanefreight.com/index.php/wp-json/oembed/1.0/embed?url=https%3A%2F%2Fwww.inlanefreight.com%2F&format=xml
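
If everything before the ? counts as the path, then yes, both collapse to the same entry. One way to normalise that (a sketch, stripping the query string with cut before deduplicating):

curl -s https://www.inlanefreight.com | grep -o 'https://www\.inlanefreight\.com[^"]*' | cut -d'?' -f1 | sort -u | wc -l

cut -d'?' -f1 drops the query string, so the two oembed URLs above count as one path.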

Very helpful, thank you.

I got stuck on this for ages. This is what I did that worked.

curl -s https://www.inlanefreight.com | grep -o 'www\.inlanefreight\.com[^"]*' | awk -F 'www.inlanefreight.com' '{print $2}' | sort | uniq | wc -l

And then there was an entry that was just \, so I took 1 off the count (the output was 35; minus 1 gives 34).
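
If you'd rather filter that stray entry than adjust the count by hand, a grep -v before the sort should do it (a sketch, assuming the stray entry really is a lone backslash):

curl -s https://www.inlanefreight.com | grep -o 'www\.inlanefreight\.com[^"]*' | awk -F 'www.inlanefreight.com' '{print $2}' | grep -v '^\\$' | sort | uniq | wc -l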

It’s a really dumb question.

So, with some help from the comments above, I also tried a few tweaks of my own. Though pretty long, it got the job done.

Will try to explain.
Please note: while trying to reply, I was alerted that new users are only allowed to include 2 links, so wherever you see
THE_DOMAIN_LINK, it refers to https://www.inlanefreight.com

1. curl -s https://www.inlanefreight.com > freight.txt
I first saved the content to a file so I would not have to run curl every time I wanted to try something new.

2. cat freight.txt | grep "THE_DOMAIN_LINK" | tr " " "\n" | grep "THE_DOMAIN_LINK" | grep -o "https[^']*" | cut -d '"' -f1 | sort -u | wc -l

=> grep "THE_DOMAIN_LINK"
I piped the saved text to grep to find all appearances of the domain name. This resulted in a number of lines, but as you already know, each one still included all the other text before the actual
THE_DOMAIN_LINK. My goal now was to find a way to extract only the text starting with THE_DOMAIN_LINK.

=>tr " " “\n” | grepTHE_DOMAIN_LINK
so i piped it to the tr command to replace all the spaces in each line with a carriage return or new line so i can at least get the lines with THE_DOMAIN_LINK appearing on their own line. of course there was still other text prepending that

=> grep -o "https[^']*"
This grabs all the text starting with "https" and everything after it, up to a single quote. NOTE: I did not go with the hrefs, because some of the links were in src attributes or just in url(, so this was the best way in my opinion. The result gave me lots of lines beginning with THE_DOMAIN_LINK.

=> cut -d '"' -f1
Some lines had a double quote (") at the end, so you would have both THE_DOMAIN_LINK and THE_DOMAIN_LINK", and to the tools those are two separate lines. So I used cut with the quote (") as the separator to split each line in two, keeping the first field with -f1. This way all lines start with THE_DOMAIN_LINK with no quotes at the end.
(I do believe there's a better approach here; see the sketch below.)
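
(One candidate for that better approach, as a rough sketch: let the grep character class exclude both quote types, which removes the need for the tr and cut steps entirely:

grep -o "https[^\"']*" freight.txt | grep "THE_DOMAIN_LINK" | sort -u | wc -l

Since -o prints only the matched text, the prefix text and trailing quotes never appear in the output.)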

=> sort -u | wc -l
Then finally I sorted them, with -u keeping unique lines only, and wc -l counting the number of lines.

Surely there are many solutions better and easier than this. If it were a Java or Python program, I would have finished this off easily, but being a Linux newbie, I came up with this.

Hope it helps someone
thank you

Genius! :smiley:

This one worked for me, thanks!

took me 3 hours to formulate this command

curl -s https://www.inlanefreight.com | grep -oE "href=[\"'][^\"']+" | sort -u | wc -l

(If you copy and paste from the forum, retype the quotes first; the forum tends to turn them into curly quotes, which break the command.)

Question:

$ curl -s https://www.inlanefreight.com | grep -o 'www\.inlanefreight\.com[^"]*' | awk -F 'www.inlanefreight.com' '{print $2}' | sort | uniq | cut -d'?' -f1 | grep -v '>' | sed 's/\\//g' | sort > paths.txt; cat paths.txt; cat paths.txt | wc -l; rm paths.txt

Answer:

/
/index.php/about-us/
/index.php/career/
/index.php/comments/feed/
/index.php/contact/
/index.php/feed/
/index.php/news/
/index.php/offices/
/index.php/wp-json/
/index.php/wp-json/oembed/1.0/embed
/index.php/wp-json/wp/v2/pages/7
/wp-content/themes/ben_theme/css/animate.css
/wp-content/themes/ben_theme/css/bootstrap-progressbar.min.css
/wp-content/themes/ben_theme/css/bootstrap.css
/wp-content/themes/ben_theme/css/colors/default.css
/wp-content/themes/ben_theme/css/font-awesome.css
/wp-content/themes/ben_theme/css/jquery.smartmenus.bootstrap.css
/wp-content/themes/ben_theme/css/magnific-popup.css
/wp-content/themes/ben_theme/css/owl.carousel.css
/wp-content/themes/ben_theme/css/owl.transitions.css
/wp-content/themes/ben_theme/images/breadcrumb-back.jpg
/wp-content/themes/ben_theme/js/bootstrap.min.js
/wp-content/themes/ben_theme/js/jquery.smartmenus.bootstrap.js
/wp-content/themes/ben_theme/js/jquery.smartmenus.js
/wp-content/themes/ben_theme/js/navigation.js
/wp-content/themes/ben_theme/js/owl.carousel.min.js
/wp-content/themes/ben_theme/style.css
/wp-includes/css/dist/block-library/style.min.css
/wp-includes/js/jquery/jquery-migrate.min.js
/wp-includes/js/jquery/jquery.min.js
/wp-includes/js/wp-embed.min.js
/wp-includes/js/wp-emoji-release.min.js
/wp-includes/wlwmanifest.xml
/xmlrpc.php
      34

recommend this command:
curl https://www.inlanefreight.com | tr ' ' '\n' | tr "'" '\n' | tr '"' '\n' | grep 'https://www.inlanefreight.com/' | sort -u | wc -l
curl is obvious. Then replace the spaces with newlines, and also replace the different quote characters with newlines; this is so that later the URLs end up one per line and isolated, to look nicer IMO. Then grep for the site name, which gives one URL per line without any extra characters, then sort -u to ensure uniqueness. The number of lines is the answer.


I completely agree… The only other thing to do is pay for a subscription that will give you access to the questions walk-through. If any of you have done this, please provide your input on how helpful the walk through option is. I don’t mind putting some money down to get a better quality of learning, but I would like to know that it is worth it first.


The correct answer, ladies and gentlemen: curl -s https://www.inlanefreight.com | tr ' ' '\n' | tr "'" '\n' | tr '"' '\n' | grep 'https://www.inlanefreight.com/' | sort -u | wc -l. Note: the forum sometimes changes ' into ’ (a curly quote); they are not the same character, and the same can happen to ".
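
You can see the difference at the byte level: the straight quote is a single byte, while the curly ’ (U+2019) is three bytes in UTF-8 (od is POSIX, so this should work on any box):

printf "'" | od -An -tx1    # 27
printf "’" | od -An -tx1    # e2 80 99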

Bruh, why didn’t you use grep -o?

Why did you use "tr" 3 times?
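
(Probably just readability; a single call can do it, since tr pads the second set by repeating its last character (GNU and BSD tr both do this). A sketch:

curl https://www.inlanefreight.com | tr " '\"" '\n' | grep 'https://www.inlanefreight.com/' | sort -u | wc -l

Here the first set is a space, a single quote, and a double quote, all mapped to newlines in one pass.)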

Hello.
Solved it this way:

curl https://www.inlanefreight.com | grep -oP '(https:\/\/www\.inlanefreight\.com[^\s?#"]*)' | sort | uniq | wc -l

This didn't work; can anyone help? It seems the problem is in the grep regex pattern.

Nope, still does nothing.

Hello,
My solution is not perfect but it does what we want.

curl "https://www.inlanefreight.com" | tr "\"" "\n" | tr "'" "\n" | grep "https://www.inlanefreight.com" | sort -u | wc -l