rsyslog parent startup failure: error reading "fork pipe"

RSyslog

I’ve been configuring my OpenMediaVault NAS server to ship logs via RSyslog to my centralised RSyslog server, when I hit a cryptic error:

rsyslog startup failure: error reading "fork pipe": No such file or directory

rsyslog didn’t start, so it took me a bit to investigate.

Turns out, the issue was a mismatch of RSyslog config syntax: OMV used one version, my templates used another.

Specifically, I’m using old-syntax multi-line way of describing global TLS settings for configuring client side of RSyslog:

global(
         DefaultNetstreamDriver="gtls"
         DefaultNetstreamDriverCAFile="/etc/rsyslog.d/ca.crt"
         DefaultNetstreamDriverCertFile="/etc/rsyslog.d/helios4.crt"
         DefaultNetstreamDriverKeyFile="/etc/rsyslog.d/helios4.key"
 )

But earlier in the file I used a more recent way of configuring RSyslog modules:

module(load="imtcp")
input(type="imtcp" port="514")

It seems RSyslog doesn’t support this kind of mixing of config styles – so one of these config stanzas needs rewriting. In my case, I actually only needed imtcp for debug purposes – so I just commented it out and RSyslog restarted just fine.





Log fail2ban Messages to Syslog

fail2ban logging into syslog

With quite a few servers accepting SSH connections and protecting themselves using fail2ban, you very quickly recognize one thing: it makes a lot of sense to centralize fail2ban reporting using syslog.

To update fail2ban logging, you need to edit the /etc/fail2ban/fail2ban.conf file and replace this:

logtarget = /var/log/fail2ban.log

with this:

logtarget = SYSLOG

Here’s how my section looks when I’m editing a file with vim:

Switching fail2ban log target to SYSLOG

Reload the fail2ban service and enjoy:

[email protected]:/var/log # systemctl reload fail2ban
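
The same edit can be scripted, which is handy when the change has to go onto several servers. Here’s a minimal sketch, assuming the file already contains a logtarget line. set_logtarget is a helper name I made up for illustration, and the sed call keeps a .bak copy of the original file:

```shell
#!/bin/sh
# set_logtarget FILE TARGET
# Rewrites an existing "logtarget = ..." line in FILE to point at TARGET.
# Hypothetical helper for illustration; a .bak backup of FILE is kept.
# TARGET must not contain "|" or "&", which are special to this sed call.
set_logtarget() {
    sed -i.bak "s|^logtarget[[:space:]]*=.*|logtarget = $2|" "$1"
}

# Example (run as root):
#   set_logtarget /etc/fail2ban/fail2ban.conf SYSLOG
```

fail2ban also reads a fail2ban.local file that overrides fail2ban.conf, so putting the setting there instead is worth considering if you want it to survive package upgrades.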





Host Key Verification Failed

Host key verification failed

When reinstalling servers with new versions of operating system or simply reprovisioning VMs under the same hostname, you eventually get this Host Key Verification Failed scenario. Should be easy enough to fix, once you’re positive that’s a valid infrastructure change.

Host Key Verification

Host key verification happens when you attempt to access remote server with SSH. Before verifying if you have a user on the remote server and whether your password or SSH key match that remote user, SSH client must do basic sanity checks on the lower level.

Specifically, the SSH client checks whether you have connected to the remote server before, and whether anything has changed since last time (it shouldn’t have).

Server (host) keys must not change during a normal life cycle of a server – they are generated at server/VM build stage (when OpenSSH starts up the first time) and remain the same – it’s the server’s identity.

This means that if your SSH client has one key fingerprint stored for a particular server and then suddenly detects a different one, it’s flagged as an issue: at best, you’re looking at a new, legit server replacement with the same hostname. At worst, someone’s trying to intercept your connection and/or pretend to be your server.



Host Key Verification Failed

Here’s how I get this error on my Macbook (s1.unixtutorial.org doesn’t really exist, it’s just a hostname I show here as example):

[email protected]:~ $ ssh s1.unixtutorial.org
Warning: the ECDSA host key for 's1.unixtutorial.org' differs from the key for the IP address '51.159.18.142'
Offending key for IP in /Users/greys/.ssh/known_hosts:590
Matching host key in /Users/greys/.ssh/known_hosts:592
Are you sure you want to continue connecting (yes/no)?

At this stage your default answer should always be “no”, followed by inspection of the known_hosts file to confirm what happened and why identity appears to be different.

If you answer no, you’ll get the Host Key Verification Failed error:

[email protected]:~ $ ssh s1.unixtutorial.org
Warning: the ECDSA host key for 's1.unixtutorial.org' differs from the key for the IP address '51.159.18.142'
Offending key for IP in /Users/greys/.ssh/known_hosts:590
Matching host key in /Users/greys/.ssh/known_hosts:592
Are you sure you want to continue connecting (yes/no)? no
Host key verification failed.

How To Solve Host Key Verification Errors

The output above actually tells you what to do: inspect file known_hosts and look at the lines 590 and 592 specifically. One of them is likely to be obsolete, and if you remove it the issue will go away.

Specifically, if you (like me) just reinstalled the dedicated server or VM with a new OS but kept the original hostname, then the issue is expected (new server definitely generated a new host key), so the solution is indeed to remove old key from the known_hosts file and re-attempt the connection.

First, I edited the /Users/greys/.ssh/known_hosts file and removed line 590. To find it, either jump to the given line number or search for the hostname you just tried to ssh into (s1.unixtutorial.org in my case). The line looked something like this:

s1.unixtutorial.org,51.159.xx.yy ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlkzdHAyNTYAAAAxyzAgBPbBCXCL5w8
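
If you’d rather not open an editor, the removal can be scripted. This is just a sketch: remove_known_hosts_line is a hypothetical helper name, and it deletes a single line by number while keeping a .bak backup of the file:

```shell
#!/bin/sh
# remove_known_hosts_line FILE LINENO
# Deletes one line from FILE by its number, keeping a .bak backup.
# Hypothetical helper name, for illustration only.
remove_known_hosts_line() {
    case $2 in
        ''|*[!0-9]*) echo "line number must be numeric" >&2; return 1 ;;
    esac
    sed -i.bak "${2}d" "$1"
}

# Example, using the line number from the "Offending key" message:
#   remove_known_hosts_line ~/.ssh/known_hosts 590
```

OpenSSH also ships its own tool for this: ssh-keygen -R s1.unixtutorial.org removes every known_hosts entry for that hostname, so you don’t need the line number at all.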

We can try reconnecting now, answer yes and connect to the server:

[email protected]:~ $ ssh s1.unixtutorial.org
The authenticity of host 's1.unixtutorial.org (51.159.xx.yy)' can't be established.
ECDSA key fingerprint is SHA256:tviW39xN2M+4eZOUGi8UFvBZoHKaLaijBA581Nrhjac.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 's1.unixtutorial.org,51.159.xx.yy' (ECDSA) to the list of known hosts.
Activate the web console with: systemctl enable --now cockpit.socket
Last login: Fri Feb  7 21:18:35 2020 from unixtutorial.org
[[email protected] ~]$

As you can see, the output now makes a lot more sense: our SSH client can’t establish authenticity of the remote server s1.unixtutorial.org – this is because we removed any mention of that server from our known_hosts file in previous step. Answering yes adds info about s1.unixtutorial.org, so any later SSH sessions will work just fine:

[email protected]:~ $ ssh s1.unixtutorial.org
Activate the web console with: systemctl enable --now cockpit.socket
Last login: Sat Feb  8 18:31:39 2020 from 93.107.36.193
[[email protected] ~]$

Copying Host Keys to New Server

I should note that in some cases your setup or organisation would require the same host keys to be kept even after a server reinstall. In this case, you’ll need to grab the SSH host keys from the last known backup of the old server and re-deploy them onto the new server – I’ll show/explain this in one of the future posts.





DEBUG: cron keeps piling up in macOS

cron processes piling up in macOS Catalina

So, long story… After upgrading to macOS Catalina my years-old automount.sh script running via cron stopped working. It’s been a long enough journey of fixing the script itself (sudo permissions, PATH variable not having some important directories in it when run as a script), but after script was fixed I faced another problem: cron processes keep piling up.

Why is this a problem? Eventually, my Macbook would end up with having more than 10 thousand (!) cron related processes and would just run out of process space – no command can be typed, no app can be started. Only shutdown and power on would fix this.

I’ve been looking at this problem for quite some time, and now that I’m closer to solving it I’d like to share first findings.

What is this cron thing?

Don’t remember if I mentioned cron much on Unix Tutorial, so here’s a brief summary: cron is a system service that helps you schedule and regularly run commands. It has crontabs: files which list recurrence pattern and the command line to run.

Here’s an example of a crontab entry. Each of the five asterisks stands for one schedule field (minute, hour, day of month, month and day of week), and an asterisk means “every value”, so the line below would run my script every minute:

* * * * * /Users/greys/scripts/try.sh

And here’s my automounter script, it runs every 15 minutes (so I’m specifying all the valid times with 15min interval – 0 minutes, 15 minutes, 30 minutes and 45 minutes):

0,15,30,45 * * * * /Users/greys/scripts/automount.sh
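
It’s easy to drop one of the five schedule fields when typing entries like these, so here’s a rough sanity check. It’s only a sketch: is_cron_entry is a hypothetical helper name, and it assumes purely numeric fields (digits, asterisks, commas, dashes and slashes), so entries using month or weekday names would be rejected:

```shell
#!/bin/sh
# is_cron_entry 'CRONTAB LINE'
# Succeeds when the line has five schedule fields followed by a command.
# Hypothetical helper; it checks the shape of the fields, not their ranges.
is_cron_entry() {
    printf '%s\n' "$1" | awk '
        NF < 6 { exit 1 }    # need five time fields plus a command
        {
            for (i = 1; i <= 5; i++)
                if ($i !~ "^[0-9*,/-]+$") exit 1
            exit 0
        }'
}

# Example (quote the line so the shell does not expand the asterisks):
#   is_cron_entry '0,15,30,45 * * * * /Users/greys/scripts/automount.sh'
```

An entry with only four schedule fields fails this check, because the command ends up where the fifth field should be.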

Every user on your Unix-like system can have a crontab (and yes, there’s a way to prohibit cron use for certain users), and usually root or adm user has lots of OS specific tidy-up scripts in Linux and Solaris systems.

The thing with cron is it’s supposed to be this scheduler that runs your tasks regularly and then always stays in the shadows. It’s not meant to be piling processes up, as long as your scripts invoked from cron are working correctly.

Debugging cron in macOS

Turns out, /usr/sbin/cron has quite a few options for debugging in macOS:

 -x debugflag[,...]
         Enable writing of debugging information to standard output.  One or more of the
         following comma separated debugflag identifiers must be specified:

         bit   currently not used
         ext   make the other debug flags more verbose
         load  be verbose when loading crontab files
         misc  be verbose about miscellaneous one-off events
         pars  be verbose about parsing individual crontab lines
         proc  be verbose about the state of the process, including all of its offspring
         sch   be verbose when iterating through the scheduling algorithms
         test  trace through the execution, but do not perform any actions

What I ended up doing is:

Step 1: Kill all the existing crons

mcfly:~ greys$ sudo pkill cron

Step 2: Quickly start an interactive debug copy of cron as root

mcfly:~ root# /usr/sbin/cron -x ext,load,misc,pars,proc,sch

When I say “quickly” I’m referring to the fact that cron service is managed by launchd in macOS, meaning you kill it and it respawns pretty much instantly.

So I would get this error:

mcfly:~ root# /usr/sbin/cron -x ext,load,misc,pars,proc,sch
-sh: kill: (23614) - No such process
debug flags enabled: ext sch proc pars load misc
log_it: (CRON 24156) DEATH (cron already running, pid: 24139)
cron: cron already running, pid: 24139

And the approach I took was to kill that last running process and restart cron in the same command line:

mcfly:~ root# kill -9 24281; /usr/sbin/cron -x ext,load,misc,pars,proc,sch
debug flags enabled: ext sch proc pars load misc
[24299] cron started
[24299] load_database()
        greys:load_user()
linenum=1
linenum=2
linenum=3
linenum=4
linenum=5
linenum=6
linenum=7
load_env, read <* * * * * /Users/greys/scripts/try.sh &> /dev/null>
load_env, parse error, state = 7
linenum=0
load_entry()…about to eat comments
linenum=1
linenum=2
...

I’ll admit: this is probably way too much information, but when you’re debugging an issue there’s no such thing as too much – you’re getting all the clues you can get to try and understand the problem.

In my case, nothing was found: cron would start my cronjob, let it finish, report everything was done correctly and then still somehow leave an extra process behind:

[17464] TargetTime=1579264860, sec-to-wait=0
[17464] load_database()
[17464] spool dir mtime unch, no load needed.
[17464] tick(41,12,16,0,5)
user [greys:greys::…] cmd="/Users/greys/scripts/try.sh"
[17464] TargetTime=1579264920, sec-to-wait=60
[17464] do_command(/Users/greys/scripts/try.sh, (greys,greys,))
[17464] main process returning to work
[17464] TargetTime=1579264920, sec-to-wait=60
[17464] sleeping for 60 seconds
[17473] child_process('/Users/greys/scripts/try.sh')
[17473] child continues, closing pipes
[17473] child reading output from grandchild
[17474] grandchild process Vfork()'ed
log_it: (greys 17474) CMD (/Users/greys/scripts/try.sh)
[17473] got data (56:V) from grandchild

Here’s how the processes would look:

0 17464 17213   0 12:40pm ttys003    0:00.01 /usr/sbin/cron -x ext,load,misc,pars,proc,sch
0 17473 17464   0 12:41pm ttys003    0:00.00 /usr/sbin/cron -x ext,load,misc,pars,proc,sch
0 17476 17473   0 12:41pm ttys003    0:00.00 (cron)
0 17520 17464   0 12:42pm ttys003    0:00.00 /usr/sbin/cron -x ext,load,misc,pars,proc,sch
0 17523 17520   0 12:42pm ttys003    0:00.00 (cron)

How To Avoid Crons Piling Up in Catalina

I’m still going to revisit this with a proper fix, but there’s at least an interim one identified for now: you must forward all the output from each cronjob to /dev/null.

In daily (Linux-based) practice, I don’t redirect cronjobs output because if there’s any output generated – it’s likely an error that I want to know about. cron runs a command, and if there’s any output, it sends an email to the user who scheduled the command. You see the email, inspect and fix the problem.

But in macOS Catalina, it seems this won’t work without further tuning. Perhaps there are some mailer related permissions missing or something like that, but the fact is that any output generated by your cronjob will make the cron process keep running (even though your cronjob script has completed successfully).

So the temporary fix for me was to turn my crontab from this:

0,15,30,45 * * * * /Users/greys/scripts/automount.sh
* * * * * /Users/greys/scripts/try.sh

to this:

0,15,30,45 * * * * /Users/greys/scripts/automount.sh >/dev/null 2>&1
* * * * * /Users/greys/scripts/try.sh >/dev/null 2>&1
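
In case the >/dev/null 2>&1 part looks like magic: it sends standard output to /dev/null and then points standard error at the same place, and the order matters. Here’s a small sketch, where noisy_job is a made-up stand-in for a cron job that writes to both streams:

```shell
#!/bin/sh
# noisy_job stands in for a cron job that writes to both streams
# (hypothetical, for illustration only).
noisy_job() {
    echo "normal output"
    echo "error output" >&2
}

# stdout goes to /dev/null first, then stderr is pointed at the same place,
# so nothing is left for cron to capture.
noisy_job >/dev/null 2>&1

# Reversing the order only silences stdout: stderr is duplicated from the
# original stdout before that stdout gets redirected.
noisy_job 2>&1 >/dev/null
```

Only the second call prints anything (the error line), which is why the order used in the crontab above is the one you want.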

That’s it for now! I’m super glad I finally solved this – took a few sessions of reviewing/updating my script because frankly I focused on the script and not on the OS itself.





systemd services Status

Example of systemctl status

I’ve just learned by accident that it’s possible to run systemctl status without specifying a name of systemd service – this way you get the listing and status of all the services available in a neat tree structure.

SystemD services

As you may remember, startup services are no longer managed by /etc/init.d scripts in Linux. Instead systemd services are created – this is handy for both managing services and confirming their status (journalctl is great for showing latest status messages like error log).

Show systemd Services Status with systemctl

Run without any parameters, systemctl status command will show you a tree structure like this:

[email protected]:~$ systemctl status
 ● sd-147674
     State: running
      Jobs: 0 queued
    Failed: 0 units
     Since: Sat 2019-11-23 08:45:20 CET; 1 months 20 days ago
    CGroup: /
            ├─user.slice
            │ └─user-1000.slice
            │   ├─user@1000.service
            │   │ └─init.scope
            │   │   ├─19250 /lib/systemd/systemd --user
            │   │   └─19251 (sd-pam)
            │   └─session-1309.scope
            │     ├─19247 sshd: greys [priv]
            │     ├─19264 sshd: greys@pts/0
            │     ├─19265 -bash
            │     ├─19278 systemctl status
            │     └─19279 pager
            ├─init.scope
            │ └─1 /sbin/init
            └─system.slice
              ├─systemd-udevd.service
              │ └─361 /lib/systemd/systemd-udevd
              ├─cron.service
              │ └─541 /usr/sbin/cron -f
              ├─bind9.service
              │ └─587 /usr/sbin/named -u bind
              ├─systemd-journald.service
              │ └─345 /lib/systemd/systemd-journald
              ├─mdmonitor.service
              │ └─484 /sbin/mdadm --monitor --scan
              ├─ssh.service
              │ └─599 /usr/sbin/sshd -D
              ├─openntpd.service
              │ ├─634 /usr/sbin/ntpd -f /etc/openntpd/ntpd.conf
              │ ├─635 ntpd: ntp engine
              │ └─637 ntpd: dns engine
              ├─rsyslog.service
              │ └─542 /usr/sbin/rsyslogd -n -iNONE
...

In this output, you can see systemd service names like cron.service or ssh.service, and under each of them the process names and numerical process IDs that show how each service is provided.

INTERESTING: Note how openntpd.service is provided by 3 separate processes: ntpd and two other ntpd processes (NTP engine and DNS engine).





Check Config Before Restarting SSH Daemon

Super quick advice today, but it’s one of those pearls of experience that now and then save your day. Learn how to check and confirm that your recent changes to the SSH daemon config file (/etc/ssh/sshd_config) won’t break your remote SSH access.

Why Double-Checking Configs Is A Good Idea

I should probably start a special section of Unix Tutorial someday, just to talk about how and when things can go wrong. These things below would certainly belong to that section.

Why it’s a good idea to check that your new config file is error free:

  • avoid getting service outage (syntax error means service won’t restart)
  • prevent service malfunction (if you end up with only partial service functionality)
  • don’t get yourself locked out of service (or server, in case of broken SSH)

How To Check SSHd Config

I have shown you before how to test new SSHd config on a different port, but there’s also a way to check primary config.

Here’s how you do it:

[email protected]:~ $ sudo sshd -t 

It will either return nothing, or complain about errors or highlight deprecated options, like this:

[email protected]:~ $ sudo sshd -t
/etc/ssh/sshd_config line 56: Deprecated option RSAAuthentication
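
To make the check a habit, you can chain it with the reload so that the daemon is only touched when the config passes. A minimal sketch follows. reload_if_valid is a hypothetical helper name, and the service unit in the example is an assumption (it’s ssh on Debian-based systems and sshd on many others):

```shell
#!/bin/sh
# reload_if_valid CHECK_CMD RELOAD_CMD
# Runs CHECK_CMD; only when it exits with status 0 does RELOAD_CMD run.
# Hypothetical helper name; both arguments are passed to sh -c.
reload_if_valid() {
    if sh -c "$1"; then
        sh -c "$2"
    else
        echo "config check failed, not reloading" >&2
        return 1
    fi
}

# Example (run as root; the unit name is an assumption, see above):
#   reload_if_valid "sshd -t" "systemctl reload ssh"
```

It’s still a good idea to keep your current SSH session open until you’ve confirmed that a new login works.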

That’s all there is to it, enjoy!





preserve-root flag for rm command

rm will warn you instead of removing everything in / recursively

rm command, the one used to delete files and directories, can be very dangerous if used with root privileges. It’s comforting to know that most modern rm implementations attempt to help you avoid a complete disaster (of deleting everything on your OS disk).

What preserve-root Flag Does

The preserve-root flag has been the default behaviour of the rm command in Linux for quite some time. It means that if you attempt to recursively remove the root (/) directory, rm refuses and prints a warning:

[email protected]:/ $ rm -rf /
rm: it is dangerous to operate recursively on '/'
rm: use --no-preserve-root to override this failsafe

Why would this be dangerous? Because every directory and filesystem in Unix and Linux is mounted off the root (/) path. So if you remove files/directories there recursively, you may wipe out your entire operating system (and quite a few mounted filesystems, if you’re really out of luck).

Now, I’m running this command as my own regular user in the example above. So even if rm wasn’t protecting me, I would still be unable to do any real harm to the OS due to core OS files being protected from accidental removal by regular users. But if I was running as root, it would have been really dangerous.

Why preserve-root is Really Useful

Of course, most of us would never consciously attempt removing everything under /, but here’s a scenario that is quite common with beginners: using uninitialised variables (empty values).

[email protected]:/ $ MYDIR=mydir
[email protected]:/ $ echo rm -rf /${MYDIR}
rm -rf /mydir
[email protected]:/ $ rm -rf /${MYDIR}

In the example above, I have a variable called MYDIR, which points to a directory. I’m running the echo command first to verify what the rm command will look like – and it seems correct, so I attempt it.

But if I forget to initialise the MYDIR variable, its value will be empty, meaning my command will become much more dangerous:

[email protected]:/ $ MYDIR=
[email protected]:/ $ echo rm -rf /${MYDIR}
rm -rf /
[email protected]:/ $ rm -rf /${MYDIR}
rm: it is dangerous to operate recursively on '/'
rm: use --no-preserve-root to override this failsafe
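
preserve-root only catches the case where the path collapses to exactly /, so it’s worth guarding the variable itself as well. The shell’s ${VAR:?message} expansion aborts the command when the variable is unset or empty. Here’s a small sketch, with target_path being a hypothetical helper name used only to show the effect safely:

```shell
#!/bin/sh
# target_path NAME
# Prints /NAME, but aborts with an error if NAME is unset or empty,
# thanks to the ${var:?message} parameter expansion.
# Hypothetical helper name, for illustration only.
target_path() {
    printf '/%s\n' "${1:?directory name is empty}"
}

MYDIR=mydir
target_path "$MYDIR"        # prints /mydir

# With an empty value the expansion fails before any path is built.
# It runs in a subshell here because a failed ${var:?} terminates a
# non-interactive shell.
MYDIR=
( target_path "$MYDIR" ) || echo "refusing to continue"
```

In a real script the same guard can go straight into the command, for example rm -rf "/${MYDIR:?}", so that rm never runs when MYDIR is empty.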





OpenMediaVault Default Login

OpenMediaVault

With the Debian 10 Buster release out of the way, the next major release of OpenMediaVault – OMV5 – is still on track for a Q4/2019 release. This makes it a perfect time to refresh the install procedures in my mind and to capture the default login.

OpenMediaVault Default Login

  • Username: admin
  • Password: openmediavault

How To Reset Admin Password in OpenMediaVault

You may remember, I have a post explaining how to reset admin password in OpenMediaVault – you still need ssh access to the server as regular user though.





brew Command Not Found

Homebrew for MacOS

Quite a few visitors arrived at this blog lately with reports of “brew command not found”, so I figured a quick post would probably help.

brew is a very popular package manager

Although Homebrew is very popular on MacOS, it’s not a standard tool and not one of the MacOS Commands, so brew does not come preinstalled with your MacOS.

So when you get an error about brew not found, this is quite normal and simply means you’ve never used this software manager before – meaning you need to install it.
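
Before installing anything, you can quickly confirm whether brew really is missing or just not in your PATH. A minimal sketch, where have_command is a made-up helper name (command -v itself is a standard shell builtin):

```shell
#!/bin/sh
# have_command NAME
# Succeeds when NAME resolves to something the shell can run.
# Hypothetical helper name; command -v itself is standard POSIX.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command brew; then
    echo "brew is installed: $(command -v brew)"
else
    echo "brew is not installed (or not in PATH)"
fi
```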

How To Install brew in MacOS

The official Homebrew website tells that you simply need to run this command to get started:

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

Here’s how it will probably look:

Once you press Enter, brew install script will download the latest code for brew and deploy it:

The really cool and clever thing about Homebrew is that going forward brew will be auto-updating itself, so you’re pretty much guaranteed to be running the latest and greatest code.

That’s it for now. Chat later!





Troubleshooting: ifconfig Not Found in Debian

I’ve actually written about ifconfig not found before, but noticed recently another possible scenario so will mention it today.

Make Sure Net-Tools Package is Installed

This is still the most common reason for your shell not finding the ifconfig command: it’s just not installed by default. So install it using apt command:

$ sudo apt install net-tools

Call ifconfig Using Full Path

This is the thing I noticed on my Debian VM earlier today: your regular user may not have /sbin directory in its PATH. Which means ifconfig command will still not work if you just type the command name:

[email protected]:~$ ifconfig
-bash: ifconfig: command not found
You have new mail in /var/mail/greys

But if you type the full path to the command, it will work:

[email protected]:~$ /sbin/ifconfig
enp0s3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
         inet 192.168.1.XX  netmask 255.255.255.0  broadcast 192.168.1.255
         inet6 fe80::a00:27ff:febe:8a41  prefixlen 64  scopeid 0x20<link>
         ether 08:00:27:be:8a:41  txqueuelen 1000  (Ethernet)
         RX packets 26263716  bytes 9251567381 (8.6 GiB)
         RX errors 0  dropped 0  overruns 0  frame 0
         TX packets 131362  bytes 12206621 (11.6 MiB)
         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Update PATH Variable to Include /sbin

Edit the .profile file in your home directory. For me it’s /home/greys/.profile.

Somewhere towards the end of it there should be a PATH variable updates section, on my VM it includes linuxbrew that I installed recently:

# set PATH so it includes user's private bin if it exists
if [ -d "$HOME/bin" ] ; then
    PATH="$HOME/bin:$PATH"
fi

PATH=$PATH:/home/greys/.linuxbrew/bin
eval $(/home/greys/.linuxbrew/bin/brew shellenv)

We need to update this section. And if there isn’t one, just create another one at the end of the file. Both changes should aim to add /sbin directory to the PATH variable.

Replace this line:

PATH=$PATH:/home/greys/.linuxbrew/bin

… with this:

PATH=$PATH:/home/greys/.linuxbrew/bin:/sbin

… or add a new line:

PATH=$PATH:/sbin
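
One small catch with a plain append: every time you source .profile again in the same session, another copy of /sbin lands in PATH. If that bothers you, here’s a sketch of an append that skips directories already listed. path_append is a hypothetical helper name, not something Debian provides:

```shell
#!/bin/sh
# path_append DIR
# Appends DIR to PATH unless it is already listed there.
# Hypothetical helper name, for illustration only.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

path_append /sbin
export PATH
```

Wrapping PATH in colons before matching means the check compares whole entries, so a directory like /usr/sbin doesn’t count as /sbin.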

Save the file and read it again to test:

[email protected]:~$ source .profile
/home/greys/.linuxbrew/Homebrew/Library/Homebrew/brew.sh: line 4: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8): No such file or directory

That’s it, type ifconfig and it should work now:

[email protected]:~$ ifconfig
enp0s3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
         inet 192.168.1.XX  netmask 255.255.255.0  broadcast 192.168.1.255
         inet6 fe80::a00:27ff:febe:8a41  prefixlen 64  scopeid 0x20<link>
         ether 08:00:27:be:8a:41  txqueuelen 1000  (Ethernet)
         RX packets 26267641  bytes 9252896750 (8.6 GiB)
         RX errors 0  dropped 0  overruns 0  frame 0
         TX packets 131600  bytes 12231427 (11.6 MiB)
         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
