LSCache crawlers causing error: Too many open files

oxcid

New Member
#1
Hi,

I'm using DO's 1-click WP OLS droplet. Every time the crawlers start running, I get these types of errors:

[INFO] [uds://tmp/lshttpd/lsphp.sock] Connection error: Too many open files, adjust maximum connections to 34!
[ERROR] [*:443] HttpListener::acceptConnection(): Accept failed:Too many open files!
[ERROR] [128.xxx.xxx.xx:45868#wordpress] [Module:Cache] createEntry failed.
[128.xxx.xxx.xx:45880#wordpress] [Module:Cache] createEntry failed for update stale.

In the end this sometimes resulted in:

[NOTICE] There are 73 '503 Errors' in last 30 seconds, request a graceful restart

That's fine when the restart succeeds, but sometimes the restart doesn't happen and I get a lot more 503s. I have tried to increase the limit as suggested here (I even set it to 10x the suggested number):

https://docs.litespeedtech.com/cloud/wordpress/#how-do-i-fix-a-too-many-open-files-issue

But that only succeeded in "postponing" the errors, not fixing them completely. Where at first I got these errors every time the crawlers ran, now they only show up after 2-3 days and several crawler runs (I set the crawlers to run at an 8-hour interval; it's a frequently updated e-commerce site).
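
For reference, the change I made boils down to raising the nofile limit in /etc/security/limits.conf, roughly along these lines (the values here are only illustrative, not necessarily my exact numbers):

Code:
# /etc/security/limits.conf -- illustrative values only
*    soft    nofile    327680
*    hard    nofile    327680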

Has anybody else experienced this, or does anyone have an idea what to do? I really don't want to restart the droplet every day.

Thanks in advance.
 

Cold-Egg

Administrator
#2
Hi @oxcid ,

Please try increasing the HTTP/HTTPS connections as well if they are set to a small number, e.g. 10000 or more.
May I ask what your server spec is and what your current PHP parameters in the WebAdmin settings are?
Also, what are the crawler settings in the LSCache plugin now? A screenshot or shared config may help.

Best,
Eric
 

oxcid

New Member
#3
Hi @Cold-Egg ,

Thanks for replying. I've increased "Max Connections" and "Max SSL Connections" to 20000 and 10000 respectively.

My server spec: 4 vCPUs, 8 GB memory / 160 GB disk. I've also attached a screenshot of the lsphp parameters below (I hope this is what you need; let me know if I'm wrong).

For the crawler settings, I'll get back to you when I get home.

Thanks.

Screenshot 2020-01-10 00.10.02.png
 

lsqtwrk

Administrator
#5
Hi,

Please try the following.

For [ERROR] [*:443] HttpListener::acceptConnection(): Accept failed:Too many open files!

cat /proc/XXX/limits

where XXX is the PID of the OLS process; you can get it with ps -aux | grep openlitespeed.

Then verify the actual limits the process has.
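
If it's easier, the two steps can be combined into a one-liner; this is just a sketch and assumes pgrep can match the process by the same "openlitespeed" string used above:

Code:
# oldest matching PID should be the main OLS process; adjust the match if needed
cat /proc/$(pgrep -o -f openlitespeed)/limits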

For [INFO] [uds://tmp/lshttpd/lsphp.sock] Connection error: Too many open files, adjust maximum connections to 34!

In your screenshot, try increasing Max Connections and PHP_LSAPI_CHILDREN (currently 35) to a higher number, like 40 or 50, and see if it helps.
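
In case it helps, the children setting is the environment line on the lsphp external app in WebAdmin; after the change it would look something like this (40 is only an example, with Max Connections raised to match it):

Code:
PHP_LSAPI_CHILDREN=40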
 
#6
Hi @lsqtwrk ,

Thanks so much for the tips. Here's the output of cat /proc/XXX/limits:

Code:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             31803                31803                processes
Max open files            100000               100000               files
Max locked memory         16777216             16777216             bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       31803                31803                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
I think we're on to something here, as I see the "Max open files" value is 100000, while the output of ulimit -n is 327680, so there's a difference. Do you know what caused this? I set the value through the /etc/security/limits.conf file.

For the Max Connections and PHP_LSAPI_CHILDREN, I've set them to 40.

Thanks.
 

lsqtwrk

Administrator
#7
Well, you have 100k on open files; that should be more than enough.

ulimit -n shows the user's limit, not the process's limit; that's why you see a different number.

But I think either 100k or 327k should be more than enough for a single process to use.

If you still see the issue, please try editing the file /etc/systemd/system.conf: find #DefaultLimitNOFILE=, change it to DefaultLimitNOFILE=327680, save the change, and reboot the server.
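
To sketch what that edit looks like (the value is the one mentioned above):

Code:
# /etc/systemd/system.conf -- uncomment the line under [Manager], set the value, then reboot
DefaultLimitNOFILE=327680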

I don't think this is related, but please give it a shot.


Also, if you launched from the marketplace image, please make sure you have updated to the latest OLS version.

https://openlitespeed.org/kb/install-from-binary/

Try 1.6.5 or 1.5.10.
 
#8
Thanks @lsqtwrk. For now I'll keep an eye on the logs; it's been 3 days with the suggested setup and so far so good, not a single "Too many open files" error.

I'm currently using version 1.5.10. I did get notified about the new version 1.6.5, but I think I'll wait until 1.6.x becomes available in the repo.

Thanks again!
 
#9
Just an update: after about 2 weeks of smooth sailing, today I got the same error I mentioned at the start, Too many open files, and the server restarted gracefully. The error happened around the time the crawler was about to finish its run, maybe 2-3 minutes into it.

I've updated to OLS 1.5.11, as I see there are some improvements regarding cache management. But aside from that, I'm not sure what to do.

Thanks.
 
#11
Hi @lsqtwrk ,

From "/usr/local/lsws/cachedata/priv" I run:

Code:
find . -xdev -type f | cut -d "/" -f 2 | sort | uniq -c | sort -n
and got this response:

Code:
    525 f
    533 3
    538 d
    552 b
    555 c
    565 5
    569 4
    571 a
    572 e
    585 6
    588 0
    593 8
    595 1
    616 9
    622 2
    623 7
However, this is after I manually deleted the cachedata contents this morning, using this guide:

https://openlitespeed.org/kb/litespeed-cache-on-openlitespeed-without-plugins/

Previously, each of the directories inside cachedata contained roughly 5000-6000 files. I assume the expired cache files are not deleted, hence the "Too many open files" error. But this is purely an assumption, with no real basis.

Thanks.
 

lsqtwrk

Administrator
#12
What about /usr/local/lsws/cachedata?

How many files are inside of it?

My guess is that even if the cache is expired or purged, since the file is still there it is still mapped by the OLS process, so if you have a huge number of cached files there, it might trigger that error.

I can't confirm whether that's true or not, but please try creating a cron job that removes files older than X days, and see how it goes.
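
Something along these lines should work as a starting point; the path is the one you mentioned, and the retention value is only an example, so adjust it to your TTLs:

Code:
# example crontab entry: every day at 03:00, delete cache files not modified for over a day
0 3 * * * find /usr/local/lsws/cachedata -type f -mtime +1 -delete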
 
#13
That's also my thought on this. Currently the number of files in cachedata is 9877, but this morning, assuming each directory contained 5000 files, cachedata should have contained around 80000 files (and that's the low estimate).

I thought about creating a cron job to clear that up, but I noticed these lines in the 1.5.11 changelog:

- [New Feature] Expired cache entries are now automatically removed to free up space.

and

- [Bug Fix] Fixed cache cleanup not getting called when “storagepath” config setting was not set.

which is exactly my case; I didn't set the "storagepath" config.

I will keep an eye on the number of files, but in case I do need to set up a cleanup cron job, would this be OK?

Code:
find /usr/local/lsws/cachedata/ -type f -mmin +8 -delete
What would be the ideal expiration limit if my settings are:

Default Public Cache TTL: 86400 sec.
Default Private Cache TTL: 1800 sec.
Default Front Page TTL: 86400 sec.
Default Feed TTL: 0 sec.
Default 404, 403, 500 page TTL: 3600 sec.

Thanks.
 

mgd

New Member
#16
Hi,

I also have the "too many open files" problem with my site, which has about 600,000 different pages in the Google index. I'd like to cache these pages, but I don't know how to properly increase the open file limit on my Debian system.

I made the changes in /etc/security/limits.conf and ulimit -n shows me the correct number (999999). But the open file limit for the openlitespeed process did not change:

Code:
Limit                     Soft Limit           Hard Limit           Units
Max open files            50000                524288               files
Are there any hints on how to increase that limit too?
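
If I understand the earlier posts in this thread correctly, services started by systemd don't pick up limits.conf, so I assume I'd need either the DefaultLimitNOFILE change mentioned above or a per-service override, something like this (just a sketch; lsws.service is my guess at the unit name on my install):

Code:
# /etc/systemd/system/lsws.service.d/override.conf -- sketch, unit name assumed
[Service]
LimitNOFILE=999999

followed by systemctl daemon-reload and a restart of the service. Is that the right direction?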

I'd like to set the expiration time for the cached pages to 8 hours. Does anybody have a suggestion for how much RAM this setup will need on my server?

Thanks in advance!
 