Graceful restart not working in FreeBSD jail

#1
I'm running 1.5.0 in a jail using FreeBSD 11.1 and 1.4.32 in a jail using FreeBSD 10.3
Neither one restarts gracefully from the dashboard - they both just exit.

A command line restart takes about 18s, which is sort of tolerable, but it means 18s of downtime, which is not the same as the zero downtime experience when running on Ubuntu.

Debug mode doesn't tell us much.
2018-10-30 01:50:19.859960 [DEBUG] Graceful Restart...
2018-10-30 01:50:19.860036 [INFO] Pass listener *:7080, copy fd 10 to 1001.
2018-10-30 01:50:19.860050 [NOTICE] [Child: 52063] Start shutting down gracefully ...
2018-10-30 01:50:19.860050 [NOTICE] [Child: 52061] Start shutting down gracefully ...
2018-10-30 01:50:19.860068 [NOTICE] [Child: 52064] Start shutting down gracefully ...
2018-10-30 01:50:19.860050 [NOTICE] [Child: 52059] Start shutting down gracefully ...
2018-10-30 01:50:19.860274 [INFO] Stop listener *:80.
2018-10-30 01:50:19.860180 [INFO] Stop listener *:80.
2018-10-30 01:50:19.860313 [INFO] Stop listener *:80.
2018-10-30 01:50:19.860143 [INFO] Pass listener *:80, copy fd 11 to 1000.
2018-10-30 01:50:19.860059 [NOTICE] [Child: 52060] Start shutting down gracefully ...
2018-10-30 01:50:19.860435 [NOTICE] [Child: 52063] Shut down successfully!
2018-10-30 01:50:19.860050 [NOTICE] [Child: 52065] Start shutting down gracefully ...
2018-10-30 01:50:19.860070 [NOTICE] [Child: 52066] Start shutting down gracefully ...
2018-10-30 01:50:19.860490 [INFO] Stop listener *:80.
2018-10-30 01:50:19.860421 [INFO] Stop listener *:7080.
2018-10-30 01:50:19.860393 [NOTICE] [Child: 52064] Shut down successfully!
2018-10-30 01:50:19.860679 [NOTICE] [Child: 52060] Shut down successfully!
2018-10-30 01:50:19.860233 [INFO] Stop listener *:80.
2018-10-30 01:50:19.860533 [NOTICE] [AdminPHP] stop worker processes
2018-10-30 01:50:19.860608 [INFO] Stop listener *:80.
2018-10-30 01:50:19.860823 [NOTICE] [lsphp] stop worker processes
2018-10-30 01:50:19.860753 [NOTICE] [AdminPHP] stop worker processes
2018-10-30 01:50:19.860903 [NOTICE] [Child: 52066] Shut down successfully!
2018-10-30 01:50:19.860842 [NOTICE] [Child: 52061] Shut down successfully!
2018-10-30 01:50:19.860933 [NOTICE] [lsphp] stop worker processes
2018-10-30 01:50:19.860961 [NOTICE] [AdminPHP] stop worker processes
2018-10-30 01:50:19.860722 [NOTICE] [AdminPHP] stop worker processes
2018-10-30 01:50:19.861095 [NOTICE] [lsphp] stop worker processes
2018-10-30 01:50:19.861128 [NOTICE] [lsphp] stop worker processes
2018-10-30 01:50:19.860049 [NOTICE] [Child: 52062] Start shutting down gracefully ...
2018-10-30 01:50:19.861255 [INFO] Stop listener *:80.
2018-10-30 01:50:19.860996 [NOTICE] [AdminPHP] stop worker processes
2018-10-30 01:50:19.861374 [NOTICE] [lsphp] stop worker processes
2018-10-30 01:50:19.861323 [NOTICE] [Child: 52062] Shut down successfully!
2018-10-30 01:50:19.861466 [NOTICE] [AdminPHP] stop worker processes
2018-10-30 01:50:19.861584 [NOTICE] [lsphp] stop worker processes
2018-10-30 01:50:19.860569 [INFO] Stop listener *:80.
2018-10-30 01:50:19.862473 [NOTICE] [Child: 52065] Shut down successfully!
2018-10-30 01:50:19.862545 [NOTICE] [AdminPHP] stop worker processes
2018-10-30 01:50:19.862615 [NOTICE] [lsphp] stop worker processes
2018-10-30 01:50:19.863474 [NOTICE] [AutoRestarter] cleanup children processes and unix sockets belong to process 52063 !
2018-10-30 01:50:19.863772 [NOTICE] [AutoRestarter] child process with pid=52063 exited with status=0!
2018-10-30 01:50:19.863833 [NOTICE] [AutoRestarter] cleanup children processes and unix sockets belong to process 52060 !
2018-10-30 01:50:19.864002 [NOTICE] [AutoRestarter] child process with pid=52060 exited with status=0!
2018-10-30 01:50:19.864037 [NOTICE] [AutoRestarter] cleanup children processes and unix sockets belong to process 52066 !
2018-10-30 01:50:19.864197 [NOTICE] [AutoRestarter] child process with pid=52066 exited with status=0!
2018-10-30 01:50:19.864229 [NOTICE] [AutoRestarter] cleanup children processes and unix sockets belong to process 52061 !
2018-10-30 01:50:19.864479 [NOTICE] [AutoRestarter] child process with pid=52061 exited with status=0!
2018-10-30 01:50:19.864514 [NOTICE] [AutoRestarter] cleanup children processes and unix sockets belong to process 52064 !
2018-10-30 01:50:19.864882 [NOTICE] [AutoRestarter] child process with pid=52064 exited with status=0!
2018-10-30 01:50:19.864917 [NOTICE] [AutoRestarter] cleanup children processes and unix sockets belong to process 52065 !
2018-10-30 01:50:19.880116 [NOTICE] [AutoRestarter] child process with pid=52065 exited with status=0!
2018-10-30 01:50:19.880172 [NOTICE] [AutoRestarter] cleanup children processes and unix sockets belong to process 52062 !
2018-10-30 01:50:19.987193 [NOTICE] [AutoRestarter] child process with pid=52062 exited with status=0!
2018-10-30 01:50:28.209034 [NOTICE] [AutoRestarter] cleanup children processes and unix sockets belong to process 52059 !
2018-10-30 01:50:28.209111 [DEBUG] [AutoRestater] Clean up child process with pid: 52192
2018-10-30 01:50:28.316182 [NOTICE] [AutoRestarter] child process with pid=52059 exited with status=0!
2018-10-30 01:50:28.316225 [NOTICE] [PID:52031] Server Stopped!
^C
root@vws8:/usr/local/lsws/logs # ps -ax | grep open
52242 0 R+J 0:00.00 grep open
root@vws8:/usr/local/lsws/logs #

Hints on where to look at this would be appreciated.
I can give you root access to the jail if needed.

I have not tested this outside of a jail, as I didn't have a raw server to install it on.

Danny
 
#2
I've looked into this a bit more.

It's got something to do with the admin tcp socket not being released during clean up, I think.
This is what ps and sockstat look like with the server running

root@vws8:~ # date ; sockstat | grep 7080 ; ps -ax | grep open
Tue Oct 30 13:49:40 AEDT 2018
nobody openlitesp 56252 10 tcp4 111.222.333.4:7080 *:*
root openlitesp 56224 10 tcp4 111.222.333.4:7080 *:*
56224 - SJ 0:00.04 ./litespeed (openlitespeed)
56229 - SJ 0:00.00 ./litespeed (openlitespeed)
56252 - SJ 0:00.01 ./litespeed (openlitespeed)
56253 - SJ 0:00.01 ./litespeed (openlitespeed)
56254 - SJ 0:00.01 ./litespeed (openlitespeed)
56255 - SJ 0:00.01 ./litespeed (openlitespeed)
56256 - SJ 0:00.01 ./litespeed (openlitespeed)
56257 - SJ 0:00.01 ./litespeed (openlitespeed)
56258 - SJ 0:00.01 ./litespeed (openlitespeed)
56259 - SJ 0:00.01 ./litespeed (openlitespeed)
56274 1 R+J 0:00.00 grep open

I clicked Graceful Restart at 13:50:03 and the new startup begins at 13:50:04
Note that the same process ids are running and 56224 is still listening on 7080

root@vws8:~ # date ; sockstat | grep 7080 ; ps -ax | grep open
Tue Oct 30 13:50:04 AEDT 2018
nobody openlitesp 56252 22 tcp4 111.222.333.4:7080 111.222.444.144:8178
root openlitesp 56224 10 tcp4 111.222.333.4:7080 *:*
root openlitesp 56224 1002 tcp4 111.222.333.4:7080 *:*
? ? ? ? tcp4 111.222.333.4:7080 111.222.444.144:8179
56224 - SJ 0:00.04 ./litespeed (openlitespeed)
56229 - SJ 0:00.00 ./litespeed (openlitespeed)
56252 - SNJ 0:00.04 ./litespeed (openlitespeed)
56297 1 S+J 0:00.00 grep open

Then things fall over....
2018-10-30 13:50:04.904872 [DEBUG] Config listener [adminListener] [*:7080]
2018-10-30 13:50:05.226467 [ERROR] HttpListener::start(): Can't listen at address adminListener: Address already in use!
2018-10-30 13:50:05.226504 [ERROR] HttpServer::addListener(adminListener) failed to create new listener
2018-10-30 13:50:05.226534 [ERROR] [config:admin:listener:adminListener] failed to start listener on address *:7080!
2018-10-30 13:50:05.226551 [ERROR] [config:admin:listener] No listener is available for admin virtual host!
2018-10-30 13:50:05.226599 [ERROR] Fatal error in configuration, exit!
 

Pong

Administrator
#3
Can you log a ticket with us with tmp root(full root, not jail user) password? We can take a look. may switch between chroot and non chroot for a test.
 
Last edited:
#4
I was just about to hit Send on a long PM with login details etc, when I realised that all of the listeners were using 0.0.0.0 and that might cause conflicts if you used chroot and not jail.

So I changed the HTTP, HTTPS and ADMIN listeners to the specific IP address of the jail and lo! Graceful Restart now works.

Thanks for the hint!

Danny
 
Top