Alternative to LiteSpeed Cache Warmup Crawler

#21
There is a possibility to provide a workaround, although it would be better if DirectAdmin solved this problem.
You can just change the docroot in the custom httpd configuration in DirectAdmin. By default they use a symlink from private to public, which is not a wrong setup.
You can tell your DA users this, @LiteCache.

Login as admin on DA.
Go to Custom HTTPD Configurations and edit the httpd config for the domain you want to change.

Put this in the main top block:

Code:
|?DOCROOT=/home/USERNAME/domains/DOMAIN/public_html|
 
#22
What the hell is this here! Guys, are you serious? This thing is absolutely useless. I don't know how a serious organization like LiteSpeed allowed this clown and his sloppily written code to be present in this form... it's shocking and disgusting.
 
#23
That’s a strong message. From my perspective, it looks like an enthusiastic user just trying to share a crawler service with the community. I haven’t tested or reviewed it thoroughly, but as long as it doesn’t contain malware or involve any kind of fraud, I don’t see an issue.
 
#24
What the hell is this here! Guys, are you serious? This thing is absolutely useless. I don't know how a serious organization like LiteSpeed allowed this clown and his sloppily written code to be present in this form... it's shocking and disgusting.
If you want to criticize something, you are welcome to do so, but spare us the stupid chatter and hate speech.
 
#25
That’s a strong message. From my perspective, it looks like an enthusiastic user just trying to share a crawler service with the community. I haven’t tested or reviewed it thoroughly, but as long as it doesn’t contain malware or involve any kind of fraud, I don’t see an issue.
The code is obfuscated with ionCube. You don't know whether there is malware or something like that inside. And the visible part of the code is written like elephants learning to draw. Beautiful, for the elephant...
 
#26
If you want to criticize something, you are welcome to do so, but spare us the stupid chatter and hate speech.
Dude, in my career I have seen a lot of coders like you. They only ever cause problems. Nothing useful...

For people who need a fast solution:

You can write a script in five minutes that crawls pages with chromium-browser on any Linux:

timeout 5s chromium-browser --headless --disable-gpu --no-sandbox \
--virtual-time-budget=5000 \
--user-agent="Mozilla/5.0 (iPhone; CPU iPhone OS 16_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.2 Mobile/15E148 Safari/604.1" \
--dump-dom "$url"


If someone needs the whole script for crawling via sitemap.xml, tell me here and I will upload it for free and without obfuscation.
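For anyone who wants to try the idea right away, here is a minimal sketch of such a sitemap warmup script. It assumes curl and chromium-browser are installed and that the sitemap is a plain <urlset> (not a sitemap index); the function names are illustrative, not taken from the attached script.

```shell
#!/usr/bin/env bash
# Sketch: fetch a sitemap, extract every <loc> URL, and warm each page
# in headless Chromium so AJAX requests also execute.
set -u

UA="Mozilla/5.0 (iPhone; CPU iPhone OS 16_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.2 Mobile/15E148 Safari/604.1"

# Pull every <loc>...</loc> value out of a sitemap read from stdin.
extract_locs() {
  grep -o '<loc>[^<]*</loc>' | sed -e 's/<loc>//' -e 's#</loc>##'
}

# Render one page headlessly; "|| true" keeps the loop going on timeouts.
warm_url() {
  timeout 5s chromium-browser --headless --disable-gpu --no-sandbox \
    --virtual-time-budget=5000 \
    --user-agent="$UA" \
    --dump-dom "$1" > /dev/null || true
}

main() {
  sitemap="${1:?usage: $0 https://example.com/sitemap.xml}"
  curl -fsS "$sitemap" | extract_locs | while read -r url; do
    echo "warming $url"
    warm_url "$url"
  done
}

# Run only when a sitemap URL was actually given.
if [ "$#" -ge 1 ]; then
  main "$@"
fi
```

For sitemap index files you would first extract the child sitemap URLs with the same `extract_locs` helper and then repeat the loop for each one.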
 
#28
@gospodinov

You don't understand how LiteSpeed LScache works. Please invest some hours and learn before you post unqualified stuff.
Sure. I spent 5 hours and I can give the community something working, without malware and without obfuscated code.

usage : ./cache-warmup.sh https://example.net/sitemap.xml
It works as a cron job:

10 7 * * * /warmup-cache/cache-warmup.sh https://example.com/sitemap_index.xml > /dev/null

You can increase the number of parallel jobs; the default is 10, and it works well and fast on a 4 GB RAM VPS:
MAX_PARALLEL=10
With 10 parallel jobs it warms up around 10k URLs per 8 hours (desktop and mobile).
With this configuration I get x-litespeed-cache: hit on every checked URL.
Do we need something better?
I don't think so.
Do we need obfuscated code?
I don't think so.
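The MAX_PARALLEL idea can be sketched with `xargs -P`. This is an illustration only, not the attached script: the `WARM_CMD` hook and the curl flags are my assumptions, chosen so the fetch command can be swapped out.

```shell
#!/usr/bin/env bash
# Sketch: warm a list of URLs (one per line on stdin) with up to
# MAX_PARALLEL concurrent workers via xargs -P.
set -u

MAX_PARALLEL="${MAX_PARALLEL:-10}"

# WARM_CMD is a hypothetical override hook; by default each URL is
# fetched with curl and the body discarded -- the request only needs
# to reach the server so LiteSpeed can build its cache copy.
warm_parallel() {
  cmd="${WARM_CMD:-curl -fsS -o /dev/null}"
  xargs -P "$MAX_PARALLEL" -I {} $cmd {}
}
```

Usage would be something like `extract_locs < sitemap.xml | warm_parallel`; raising MAX_PARALLEL trades server load for wall-clock time.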
 


#31
Every http request creates a cache copy, but the correct cache copy for the relevant CMS needs more than a User-Agent. You're still a long way from being able to provide a working solution. You know how to use curl and how to parse a sitemap file, but nothing more.
 
#32
Every http request creates a cache copy, but the correct cache copy for the relevant CMS needs more than a User-Agent. You're still a long way from being able to provide a working solution. You know how to use curl and how to parse a sitemap file, but nothing more.
As you can see, this is not only an HTTP request. Chromium also executes the AJAX requests, and we end up with a working cache.
That completely fits our needs.
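Whether a given URL really ended up cached can be spot-checked from the response headers. A small sketch (the parsing is split into its own function; `x-litespeed-cache` is the header LiteSpeed actually sends, everything else here is illustrative):

```shell
#!/usr/bin/env bash
# Sketch: report the x-litespeed-cache response header for a URL.
set -u

# Read raw HTTP response headers on stdin and print the
# x-litespeed-cache line, or a placeholder when it is absent.
parse_cache_header() {
  tr -d '\r' | grep -i '^x-litespeed-cache:' || echo 'x-litespeed-cache: (absent)'
}

# Fetch only the headers for "$1" and report the cache status.
check_cache() {
  curl -fsSI "$1" | parse_cache_header
}
```

Running `check_cache` twice against the same page is the usual pattern: the first request may show `miss` while warming the cache, the second should show `hit`.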
 