Graceful restart not working in FreeBSD jail

#1
I'm running 1.5.0 in a jail using FreeBSD 11.1 and 1.4.32 in a jail using FreeBSD 10.3
Neither one restarts gracefully from the dashboard - they both just exit.

A command line restart takes about 18s, which is sort of tolerable, but it means 18s of downtime, which is not the same as the zero downtime experience when running on Ubuntu.

Debug mode doesn't tell us much.
2018-10-30 01:50:19.859960 [DEBUG] Graceful Restart...
2018-10-30 01:50:19.860036 [INFO] Pass listener *:7080, copy fd 10 to 1001.
2018-10-30 01:50:19.860050 [NOTICE] [Child: 52063] Start shutting down gracefully ...
2018-10-30 01:50:19.860050 [NOTICE] [Child: 52061] Start shutting down gracefully ...
2018-10-30 01:50:19.860068 [NOTICE] [Child: 52064] Start shutting down gracefully ...
2018-10-30 01:50:19.860050 [NOTICE] [Child: 52059] Start shutting down gracefully ...
2018-10-30 01:50:19.860274 [INFO] Stop listener *:80.
2018-10-30 01:50:19.860180 [INFO] Stop listener *:80.
2018-10-30 01:50:19.860313 [INFO] Stop listener *:80.
2018-10-30 01:50:19.860143 [INFO] Pass listener *:80, copy fd 11 to 1000.
2018-10-30 01:50:19.860059 [NOTICE] [Child: 52060] Start shutting down gracefully ...
2018-10-30 01:50:19.860435 [NOTICE] [Child: 52063] Shut down successfully!
2018-10-30 01:50:19.860050 [NOTICE] [Child: 52065] Start shutting down gracefully ...
2018-10-30 01:50:19.860070 [NOTICE] [Child: 52066] Start shutting down gracefully ...
2018-10-30 01:50:19.860490 [INFO] Stop listener *:80.
2018-10-30 01:50:19.860421 [INFO] Stop listener *:7080.
2018-10-30 01:50:19.860393 [NOTICE] [Child: 52064] Shut down successfully!
2018-10-30 01:50:19.860679 [NOTICE] [Child: 52060] Shut down successfully!
2018-10-30 01:50:19.860233 [INFO] Stop listener *:80.
2018-10-30 01:50:19.860533 [NOTICE] [AdminPHP] stop worker processes
2018-10-30 01:50:19.860608 [INFO] Stop listener *:80.
2018-10-30 01:50:19.860823 [NOTICE] [lsphp] stop worker processes
2018-10-30 01:50:19.860753 [NOTICE] [AdminPHP] stop worker processes
2018-10-30 01:50:19.860903 [NOTICE] [Child: 52066] Shut down successfully!
2018-10-30 01:50:19.860842 [NOTICE] [Child: 52061] Shut down successfully!
2018-10-30 01:50:19.860933 [NOTICE] [lsphp] stop worker processes
2018-10-30 01:50:19.860961 [NOTICE] [AdminPHP] stop worker processes
2018-10-30 01:50:19.860722 [NOTICE] [AdminPHP] stop worker processes
2018-10-30 01:50:19.861095 [NOTICE] [lsphp] stop worker processes
2018-10-30 01:50:19.861128 [NOTICE] [lsphp] stop worker processes
2018-10-30 01:50:19.860049 [NOTICE] [Child: 52062] Start shutting down gracefully ...
2018-10-30 01:50:19.861255 [INFO] Stop listener *:80.
2018-10-30 01:50:19.860996 [NOTICE] [AdminPHP] stop worker processes
2018-10-30 01:50:19.861374 [NOTICE] [lsphp] stop worker processes
2018-10-30 01:50:19.861323 [NOTICE] [Child: 52062] Shut down successfully!
2018-10-30 01:50:19.861466 [NOTICE] [AdminPHP] stop worker processes
2018-10-30 01:50:19.861584 [NOTICE] [lsphp] stop worker processes
2018-10-30 01:50:19.860569 [INFO] Stop listener *:80.
2018-10-30 01:50:19.862473 [NOTICE] [Child: 52065] Shut down successfully!
2018-10-30 01:50:19.862545 [NOTICE] [AdminPHP] stop worker processes
2018-10-30 01:50:19.862615 [NOTICE] [lsphp] stop worker processes
2018-10-30 01:50:19.863474 [NOTICE] [AutoRestarter] cleanup children processes and unix sockets belong to process 52063 !
2018-10-30 01:50:19.863772 [NOTICE] [AutoRestarter] child process with pid=52063 exited with status=0!
2018-10-30 01:50:19.863833 [NOTICE] [AutoRestarter] cleanup children processes and unix sockets belong to process 52060 !
2018-10-30 01:50:19.864002 [NOTICE] [AutoRestarter] child process with pid=52060 exited with status=0!
2018-10-30 01:50:19.864037 [NOTICE] [AutoRestarter] cleanup children processes and unix sockets belong to process 52066 !
2018-10-30 01:50:19.864197 [NOTICE] [AutoRestarter] child process with pid=52066 exited with status=0!
2018-10-30 01:50:19.864229 [NOTICE] [AutoRestarter] cleanup children processes and unix sockets belong to process 52061 !
2018-10-30 01:50:19.864479 [NOTICE] [AutoRestarter] child process with pid=52061 exited with status=0!
2018-10-30 01:50:19.864514 [NOTICE] [AutoRestarter] cleanup children processes and unix sockets belong to process 52064 !
2018-10-30 01:50:19.864882 [NOTICE] [AutoRestarter] child process with pid=52064 exited with status=0!
2018-10-30 01:50:19.864917 [NOTICE] [AutoRestarter] cleanup children processes and unix sockets belong to process 52065 !
2018-10-30 01:50:19.880116 [NOTICE] [AutoRestarter] child process with pid=52065 exited with status=0!
2018-10-30 01:50:19.880172 [NOTICE] [AutoRestarter] cleanup children processes and unix sockets belong to process 52062 !
2018-10-30 01:50:19.987193 [NOTICE] [AutoRestarter] child process with pid=52062 exited with status=0!
2018-10-30 01:50:28.209034 [NOTICE] [AutoRestarter] cleanup children processes and unix sockets belong to process 52059 !
2018-10-30 01:50:28.209111 [DEBUG] [AutoRestater] Clean up child process with pid: 52192
2018-10-30 01:50:28.316182 [NOTICE] [AutoRestarter] child process with pid=52059 exited with status=0!
2018-10-30 01:50:28.316225 [NOTICE] [PID:52031] Server Stopped!
^C
root@vws8:/usr/local/lsws/logs # ps -ax | grep open
52242 0 R+J 0:00.00 grep open
root@vws8:/usr/local/lsws/logs #

Hints on where to look at this would be appreciated.
I can give you root access to the jail if needed.

I have not tested this outside of a jail, as I didn't have a raw server to install it on.

Danny
 
#2
I've looked into this a bit more.

It's got something to do with the admin tcp socket not being released during clean up, I think.
This is what ps and sockstat look like with the server running

root@vws8:~ # date ; sockstat | grep 7080 ; ps -ax | grep open
Tue Oct 30 13:49:40 AEDT 2018
nobody openlitesp 56252 10 tcp4 111.222.333.4:7080 *:*
root openlitesp 56224 10 tcp4 111.222.333.4:7080 *:*
56224 - SJ 0:00.04 ./litespeed (openlitespeed)
56229 - SJ 0:00.00 ./litespeed (openlitespeed)
56252 - SJ 0:00.01 ./litespeed (openlitespeed)
56253 - SJ 0:00.01 ./litespeed (openlitespeed)
56254 - SJ 0:00.01 ./litespeed (openlitespeed)
56255 - SJ 0:00.01 ./litespeed (openlitespeed)
56256 - SJ 0:00.01 ./litespeed (openlitespeed)
56257 - SJ 0:00.01 ./litespeed (openlitespeed)
56258 - SJ 0:00.01 ./litespeed (openlitespeed)
56259 - SJ 0:00.01 ./litespeed (openlitespeed)
56274 1 R+J 0:00.00 grep open

I clicked Graceful Restart at 13:50:03 and the new startup begins at 13:50:04
Note that the same process ids are running and 56224 is still listening on 7080

root@vws8:~ # date ; sockstat | grep 7080 ; ps -ax | grep open
Tue Oct 30 13:50:04 AEDT 2018
nobody openlitesp 56252 22 tcp4 111.222.333.4:7080 111.222.444.144:8178
root openlitesp 56224 10 tcp4 111.222.333.4:7080 *:*
root openlitesp 56224 1002 tcp4 111.222.333.4:7080 *:*
? ? ? ? tcp4 111.222.333.4:7080 111.222.444.144:8179
56224 - SJ 0:00.04 ./litespeed (openlitespeed)
56229 - SJ 0:00.00 ./litespeed (openlitespeed)
56252 - SNJ 0:00.04 ./litespeed (openlitespeed)
56297 1 S+J 0:00.00 grep open

Then things fall over....
2018-10-30 13:50:04.904872 [DEBUG] Config listener [adminListener] [*:7080]
2018-10-30 13:50:05.226467 [ERROR] HttpListener::start(): Can't listen at address adminListener: Address already in use!
2018-10-30 13:50:05.226504 [ERROR] HttpServer::addListener(adminListener) failed to create new listener
2018-10-30 13:50:05.226534 [ERROR] [config:admin:listener:adminListener] failed to start listener on address *:7080!
2018-10-30 13:50:05.226551 [ERROR] [config:admin:listener] No listener is available for admin virtual host!
2018-10-30 13:50:05.226599 [ERROR] Fatal error in configuration, exit!
 

Pong

Administrator
#3
Can you log a ticket with us with tmp root(full root, not jail user) password? We can take a look. may switch between chroot and non chroot for a test.
 
Last edited:
#4
I was just about to hit Send on a long PM with login details etc, when I realised that all of the listeners were using 0.0.0.0 and that might cause conflicts if you used chroot and not jail.

So I changed the HTTP, HTTPS and ADMIN listeners to the specific IP address of the jail and lo! Graceful Restart now works.

Thanks for the hint!

Danny
 
#6
Hi,

@bluewombat could you explain what you mean with the jail and lo?
I have a similar issue with admin listener and setting the listener to a specific IP address didn't help. My listener is now bound to ANY IP address (*:7080).

The interesting thing is that if I gracefully restart OpenLitespeed, sometimes it works, sometimes it doesn't. But if I stop it first and then start it, then it works.
It looks like the listener is not fully released before the new one is started. But then again, this is OpenLitespeed 1.5.11, so this might be already fixed in newer versions.
I should probably upgrade it :)

And this is the error log:
[INFO] [PlainConf] [httpServerConfig:] start parsing file /usr/local/lsws/conf/httpd_config.conf
[INFO] [PlainConf] [httpServerConfig:] module [cache] add param [checkprivatecache 1]
[INFO] [PlainConf] [httpServerConfig:] module [cache] add param [checkpubliccache 1]
[INFO] [PlainConf] [httpServerConfig:] module [cache] add param [maxcacheobjsize 10000000]
[INFO] [PlainConf] [httpServerConfig:] module [cache] add param [maxstaleage 200]
[INFO] [PlainConf] [httpServerConfig:] module [cache] add param [qscache 1]
[INFO] [PlainConf] [httpServerConfig:] module [cache] add param [reqcookiecache 1]
[INFO] [PlainConf] [httpServerConfig:] module [cache] add param [respcookiecache 1]
[INFO] [PlainConf] [httpServerConfig:] module [cache] add param [ignorereqcachectrl 1]
[INFO] [PlainConf] [httpServerConfig:] module [cache] add param [ignorerespcachectrl 0]
[INFO] [PlainConf] [httpServerConfig:] module [cache] add param [enablecache 0]
[INFO] [PlainConf] [httpServerConfig:] module [cache] add param [expireinseconds 3600]
[INFO] [PlainConf] [httpServerConfig:] module [cache] add param [enableprivatecache 0]
[INFO] [PlainConf] [httpServerConfig:] module [cache] add param [privateexpireinseconds 3600]
[NOTICE] Loading LiteSpeed/1.5.11 Open
module versions:
modpagespeed 2.2-1.11.33.4
cache 1.62
modinspector 1.1
uploadprogress LSIAPI_VERSION_STRING
mod_security 1.2
(built: Tue Jan 14 20:16:59 UTC 2020) ...
[NOTICE] Using [BoringSSL]
[NOTICE] [ADMIN] server socket: uds://usr/local/lsws/admin/tmp/admin.sock.7673
[NOTICE] [config:server:basics] httpdWorkers: 1, Num of Processors: 1
[NOTICE] chroot is disabled.
[INFO] old priority: 0, new priority: 0
[INFO] [config:server:basics2] For better obscurity, server version number is hidden in the response header.
[NOTICE] [PID: 22465]: forked cgid: 22471
[INFO] [PlainConf] [adminConfig:] start parsing file /usr/local/lsws/admin/conf/admin_config.conf
[INFO] [Module: modcompress 1.1] has been initialized successfully
[INFO] [Module: moddecompress 1.1] has been initialized successfully
[INFO] [Module: cache 1.62] has been initialized successfully
[ERROR] HttpListener::start(): Can't listen at address adminListener: Address already in use!
[ERROR] HttpServer::addListener(adminListener) failed to create new listener
[ERROR] [config:admin:listener:adminListener] failed to start listener on address *:7080!
[ERROR] [config:admin:listener] No listener is available for admin virtual host!
[ERROR] Fatal error in configuration, exit!
[NOTICE] [Child: 943] Start shutting down gracefully ...
[INFO] Stop listener *:80.
[INFO] Stop listener *:443.
[INFO] Stop listener *:7080.
[NOTICE] [Child: 943] Shut down successfully!
[NOTICE] [config:server:basics] httpdWorkers: 1, Num of Processors: 1
 
#7
Hi @papa zulu ,

I agree with your finding that the listener is not fully released by the FreeBSD kernel from binding to *:7080as fast as it should be.
The problem seemed to go away for me when I configured the system to the specific jail IP.

The "lo" in my message is actually an old English word, not used so much any more, meaning something like "wow, look at that."
It was not meant to be confused with lo0 loopback device.

Did you set the website listener to the specific jail IP address, as well as the admin listener?
I needed to change them all - admin, http, https

TBH, I've given up on OLSWS on FreeBSD. I could not get it to build under FreeBSD 12. :(
 
Last edited:
Top