Compare commits

No commits in common. "bf477da78041a1fce8a82d04da422ef60e86c556" and "e8f5d211cfe164f59292aae46ca1b143d11e3ab0" have entirely different histories.

2 changed files with 90 additions and 73 deletions

@@ -4,6 +4,7 @@
[![Build Status](https://travis-ci.org/hakavlad/nohang.svg?branch=master)](https://travis-ci.org/hakavlad/nohang) [![Build Status](https://travis-ci.org/hakavlad/nohang.svg?branch=master)](https://travis-ci.org/hakavlad/nohang)
![CodeQL](https://github.com/hakavlad/nohang/workflows/CodeQL/badge.svg) ![CodeQL](https://github.com/hakavlad/nohang/workflows/CodeQL/badge.svg)
[![Total alerts](https://img.shields.io/lgtm/alerts/g/hakavlad/nohang.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/hakavlad/nohang/alerts/)
[![Packaging status](https://repology.org/badge/tiny-repos/nohang.svg)](https://repology.org/project/nohang/versions) [![Packaging status](https://repology.org/badge/tiny-repos/nohang.svg)](https://repology.org/project/nohang/versions)
`nohang` package provides a highly configurable daemon for Linux which is able to correctly prevent [out of memory](https://en.wikipedia.org/wiki/Out_of_memory) (OOM) and keep system responsiveness in low memory conditions. `nohang` package provides a highly configurable daemon for Linux which is able to correctly prevent [out of memory](https://en.wikipedia.org/wiki/Out_of_memory) (OOM) and keep system responsiveness in low memory conditions.
@@ -43,10 +44,9 @@ Use one of the userspace OOM killers:
- [systemd-oomd](https://man7.org/linux/man-pages/man8/systemd-oomd.service.8.html): Provided by systemd as `systemd-oomd.service` that uses cgroups-v2 and pressure stall information (PSI) to monitor and take action on processes before an OOM occurs in kernel space. It's used by default on [desktop versions of Fedora 34](https://fedoraproject.org/wiki/Changes/EnableSystemdOomd). - [systemd-oomd](https://man7.org/linux/man-pages/man8/systemd-oomd.service.8.html): Provided by systemd as `systemd-oomd.service` that uses cgroups-v2 and pressure stall information (PSI) to monitor and take action on processes before an OOM occurs in kernel space. It's used by default on [desktop versions of Fedora 34](https://fedoraproject.org/wiki/Changes/EnableSystemdOomd).
- [low-memory-monitor](https://gitlab.freedesktop.org/hadess/low-memory-monitor/): There's a [project announcement](http://www.hadess.net/2019/08/low-memory-monitor-new-project.html). - [low-memory-monitor](https://gitlab.freedesktop.org/hadess/low-memory-monitor/): There's a [project announcement](http://www.hadess.net/2019/08/low-memory-monitor-new-project.html).
- [psi-monitor](https://github.com/endlessm/eos-boot-helper/tree/master/psi-monitor): It's used by default on [Endless OS](https://endlessos.com/). - [psi-monitor](https://github.com/endlessm/eos-boot-helper/tree/master/psi-monitor): It's used by default on [Endless OS](https://endlessos.com/).
- `nohang`: nohang is earlyoom on steroids and has many useful features, see below. Maybe this is a good choice for modern desktops and servers if you need fine-tuning. Previously it was used by default on [Garuda Linux](https://garudalinux.org/). - `nohang`: nohang is earlyoom on steroids and has many useful features, see below. Maybe this is a good choice for modern desktops and servers if you need fine-tuning. It's used by default on [Garuda Linux](https://garudalinux.org/).
Use these tools to improve responsiveness during heavy swapping: Use these tools to improve responsiveness during heavy swapping:
- MGLRU patchset is merged in Linux 6.1. Setting `min_ttl_ms` > 50 can help you.
- [le9-patch](https://github.com/hakavlad/le9-patch): [PATCH] mm: Protect clean file pages under memory pressure to prevent thrashing, avoid high latency and prevent livelock in near-OOM conditions. It's kernel-side solution that can fix the OOM killer behavior. - [le9-patch](https://github.com/hakavlad/le9-patch): [PATCH] mm: Protect clean file pages under memory pressure to prevent thrashing, avoid high latency and prevent livelock in near-OOM conditions. It's kernel-side solution that can fix the OOM killer behavior.
- [prelockd](https://github.com/hakavlad/prelockd): Lock executables and shared libraries in memory to improve system responsiveness under low-memory conditions. - [prelockd](https://github.com/hakavlad/prelockd): Lock executables and shared libraries in memory to improve system responsiveness under low-memory conditions.
- [memavaild](https://github.com/hakavlad/memavaild): Keep amount of available memory by evicting memory of selected cgroups into swap space. - [memavaild](https://github.com/hakavlad/memavaild): Keep amount of available memory by evicting memory of selected cgroups into swap space.
@@ -118,16 +118,18 @@ To show GUI notifications (optional):
## How to install ## How to install
#### To install on [Fedora](https://src.fedoraproject.org/rpms/nohang/): #### To install on [Fedora](https://src.fedoraproject.org/rpms/nohang/):
```bash
Orphaned for 6+ weeks, not available. $ sudo dnf install nohang-desktop
$ sudo systemctl enable --now nohang-desktop.service
```
#### To install on RHEL 7 and RHEL 8: #### To install on RHEL 7 and RHEL 8:
nohang is avaliable in [EPEL repos](https://fedoraproject.org/wiki/EPEL). nohang is avaliable in [EPEL repos](https://fedoraproject.org/wiki/EPEL).
```bash ```bash
sudo yum install nohang $ sudo yum install nohang
sudo systemctl enable nohang.service $ sudo systemctl enable nohang.service
sudo systemctl start nohang.service $ sudo systemctl start nohang.service
``` ```
To enable PSI on RHEL 8 pass `psi=1` to kernel boot cmdline. To enable PSI on RHEL 8 pass `psi=1` to kernel boot cmdline.
@@ -135,18 +137,18 @@ To enable PSI on RHEL 8 pass `psi=1` to kernel boot cmdline.
Use your favorite [AUR helper](https://wiki.archlinux.org/index.php/AUR_helpers). For example, Use your favorite [AUR helper](https://wiki.archlinux.org/index.php/AUR_helpers). For example,
```bash ```bash
yay -S nohang-git $ yay -S nohang-git
sudo systemctl enable --now nohang-desktop.service $ sudo systemctl enable --now nohang-desktop.service
``` ```
#### To install on Ubuntu 20.04/20.10 #### To install on Ubuntu 20.04/20.10
To install from [PPA](https://launchpad.net/~oibaf/+archive/ubuntu/test/): To install from [PPA](https://launchpad.net/~oibaf/+archive/ubuntu/test/):
```bash ```bash
sudo add-apt-repository ppa:oibaf/test $ sudo add-apt-repository ppa:oibaf/test
sudo apt update $ sudo apt update
sudo apt install nohang $ sudo apt install nohang
sudo systemctl enable --now nohang-desktop.service $ sudo systemctl enable --now nohang-desktop.service
``` ```
#### To install on Debian and Ubuntu-based systems: #### To install on Debian and Ubuntu-based systems:
@@ -155,23 +157,23 @@ Outdated and buggy nohang v0.1 release was packaged for [Debian 11](https://pack
It's easy to build a deb package with the latest git snapshot. Install build dependencies: It's easy to build a deb package with the latest git snapshot. Install build dependencies:
```bash ```bash
sudo apt install make fakeroot $ sudo apt install make fakeroot
``` ```
Clone the latest git snapshot and run the build script to build the package: Clone the latest git snapshot and run the build script to build the package:
```bash ```bash
git clone https://github.com/hakavlad/nohang.git && cd nohang $ git clone https://github.com/hakavlad/nohang.git && cd nohang
deb/build.sh $ deb/build.sh
``` ```
Install the package: Install the package:
```bash ```bash
sudo apt install --reinstall ./deb/package.deb $ sudo apt install --reinstall ./deb/package.deb
``` ```
Start and enable `nohang.service` or `nohang-desktop.service` after installing the package: Start and enable `nohang.service` or `nohang-desktop.service` after installing the package:
```bash ```bash
sudo systemctl enable --now nohang-desktop.service $ sudo systemctl enable --now nohang-desktop.service
``` ```
#### To install on Gentoo and derivatives (e.g. Funtoo): #### To install on Gentoo and derivatives (e.g. Funtoo):
@@ -180,53 +182,53 @@ Add the [eph kit](https://git.sr.ht/~happy_shredder/eph_kit) overlay, for exampl
Then update your repos: Then update your repos:
```bash ```bash
sudo layman -S # if added via layman $ sudo layman -S # if added via layman
sudo emerge --sync # local repo on Gentoo $ sudo emerge --sync # local repo on Gentoo
sudo ego sync # local repo on Funtoo $ sudo ego sync # local repo on Funtoo
``` ```
Install: Install:
```bash ```bash
sudo emerge -a nohang $ sudo emerge -a nohang
``` ```
Start the service: Start the service:
```bash ```bash
sudo rc-service nohang-desktop start $ sudo rc-service nohang-desktop start
``` ```
Optionally add to startup: Optionally add to startup:
```bash ```bash
sudo rc-update add nohang-desktop default $ sudo rc-update add nohang-desktop default
``` ```
#### To install the latest version on any distro: #### To install the latest version on any distro:
```bash ```bash
git clone https://github.com/hakavlad/nohang.git && cd nohang $ git clone https://github.com/hakavlad/nohang.git && cd nohang
sudo make install $ sudo make install
``` ```
Config files will be located in `/usr/local/etc/nohang/`. To enable and start unit without GUI notifications: Config files will be located in `/usr/local/etc/nohang/`. To enable and start unit without GUI notifications:
```bash ```bash
sudo systemctl enable --now nohang.service $ sudo systemctl enable --now nohang.service
``` ```
To enable and start unit with GUI notifications: To enable and start unit with GUI notifications:
```bash ```bash
sudo systemctl enable --now nohang-desktop.service $ sudo systemctl enable --now nohang-desktop.service
``` ```
On systems with OpenRC: On systems with OpenRC:
```bash ```bash
sudo make install-openrc $ sudo make install-openrc
``` ```
To uninstall: To uninstall:
```bash ```bash
sudo make uninstall $ sudo make uninstall
``` ```
## Command line options ## Command line options
@@ -406,11 +408,11 @@ Process with highest badness (found in 55 ms):
To view the latest entries in the log (for systemd users): To view the latest entries in the log (for systemd users):
```bash ```bash
sudo journalctl -eu nohang.service $ sudo journalctl -eu nohang.service
#### or #### or
sudo journalctl -eu nohang-desktop.service $ sudo journalctl -eu nohang-desktop.service
``` ```
You can also enable `separate_log` in the config to logging in `/var/log/nohang/nohang.log`. You can also enable `separate_log` in the config to logging in `/var/log/nohang/nohang.log`.
@@ -422,7 +424,7 @@ You can also enable `separate_log` in the config to logging in `/var/log/nohang/
Usage: Usage:
```bash ```bash
oom-sort $ oom-sort
``` ```
<details> <details>

@@ -7,7 +7,7 @@ from time import sleep, monotonic
from operator import itemgetter from operator import itemgetter
from sys import stdout, stderr, argv, exit from sys import stdout, stderr, argv, exit
from re import search from re import search
from re import error as invalid_re from sre_constants import error as invalid_re
from signal import signal, SIGKILL, SIGTERM, SIGINT, SIGQUIT, SIGHUP, SIGUSR1 from signal import signal, SIGKILL, SIGTERM, SIGINT, SIGQUIT, SIGHUP, SIGUSR1
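The two sides of this hunk differ only in where the regex error class comes from: `re` on one side, `sre_constants` on the other. `sre_constants` is an internal module that newer Python releases deprecate, while `re.error` is the documented public name. A minimal sketch, assuming the `invalid_re` alias is used to catch malformed patterns (the helper `is_valid_regex` is hypothetical and not part of nohang):

```python
from re import compile as re_compile
from re import error as invalid_re


def is_valid_regex(pattern):
    """Return True if the pattern compiles (hypothetical helper)."""
    try:
        re_compile(pattern)
    except invalid_re:
        # Raised for malformed patterns such as an unclosed group.
        return False
    return True
```

Because the alias name stays the same, code that catches `invalid_re` should not need to change with the import.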
@@ -165,49 +165,41 @@ def memload():
os.kill(self_pid, SIGUSR1) os.kill(self_pid, SIGUSR1)
def parse_zfs_arcstats(): def arcstats():
""" """
Parses '/proc/spl/kstat/zfs/arcstats'.
Returns a dictionary with 'name' as keys and 'data' as values.
""" """
parsed_data = {} with open(arcstats_path, 'rb') as f:
a_list = f.read().decode().split('\n')
with open(arcstats_path, 'r') as as_file: for n, line in enumerate(a_list):
lines = iter(as_file.readlines()) if n == c_min_index:
c_min = int(line.rpartition(' ')[2]) / 1024
elif n == size_index:
size = int(line.rpartition(' ')[2]) / 1024
# consume lines until the header row: elif n == arc_meta_used_index:
for line in lines: arc_meta_used = int(line.rpartition(' ')[2]) / 1024
if 'name' in line and 'data' in line:
break
# Continue iterating over the remaining lines elif n == arc_meta_min_index:
for line in lines: arc_meta_min = int(line.rpartition(' ')[2]) / 1024
if line.strip():
parts = line.split()
name = parts[0]
data_type = parts[1]
data = parts[2]
if data_type == '4':
data = int(data)
parsed_data[name] = data
return parsed_data else:
continue
c_rec = size - c_min
def zfs_arc_available(): if c_rec < 0:
"""returns how many KiB of the zfs ARC are reclaimable""" c_rec = 0
stats = parse_zfs_arcstats()
c_rec = max(stats['size'] - stats['c_min'], 0) meta_rec = arc_meta_used - arc_meta_min
# old zfs: consider arc_meta_used, arc_meta_min if meta_rec < 0:
if 'arc_meta_used' in stats and 'arc_meta_min' in stats: meta_rec = 0
meta_rec = max(stats['arc_meta_used'] - stats['arc_meta_min'], 0) zfs_available = c_rec + meta_rec
return (c_rec + meta_rec) / 1024
# new zfs: metadata is no longer accounted for separately, # return c_min, size, arc_meta_used, arc_meta_min, zfs_available
# https://github.com/openzfs/zfs/commit/a8d83e2a24de6419dc58d2a7b8f38904985726cb
return c_rec / 1024 return zfs_available
def exe(cmd): def exe(cmd):
@@ -217,7 +209,7 @@ def exe(cmd):
cmd_num_dict['cmd_num'] += 1 cmd_num_dict['cmd_num'] += 1
cmd_num = cmd_num_dict['cmd_num'] cmd_num = cmd_num_dict['cmd_num']
th_name = threading.current_thread().name th_name = threading.current_thread().getName()
log('Executing Command-{} {} with timeout {}s in {}'.format( log('Executing Command-{} {} with timeout {}s in {}'.format(
cmd_num, cmd_num,
@@ -245,11 +237,11 @@ def start_thread(func, *a, **k):
""" run function in a new thread """ run function in a new thread
""" """
th = threading.Thread(target=func, args=a, kwargs=k, daemon=True) th = threading.Thread(target=func, args=a, kwargs=k, daemon=True)
th_name = th.name th_name = th.getName()
if debug_threading: if debug_threading:
log('Starting {} from {}'.format( log('Starting {} from {}'.format(
th_name, threading.current_thread().name th_name, threading.current_thread().getName()
)) ))
try: try:
@@ -358,7 +350,7 @@ def pop(cmd):
else: else:
wait_time = 30 wait_time = 30
th_name = threading.current_thread().name th_name = threading.current_thread().getName()
log('Executing Command-{} {} with timeout {}s in {}'.format( log('Executing Command-{} {} with timeout {}s in {}'.format(
cmd_num, cmd_num,
@@ -1349,7 +1341,7 @@ def check_mem_and_swap():
sf = int(m_list[swap_free_index].split(':')[1]) sf = int(m_list[swap_free_index].split(':')[1])
if ZFS: if ZFS:
ma += zfs_arc_available() ma += arcstats()
return ma, st, sf return ma, st, sf
@@ -1377,7 +1369,7 @@ def meminfo():
md['available'] = mem_available md['available'] = mem_available
if ZFS: if ZFS:
z = zfs_arc_available() z = arcstats()
mem_available += z mem_available += z
md['shared'] = shmem md['shared'] = shmem
@@ -3703,7 +3695,7 @@ if 'max_victim_ancestry_depth' in config_dict:
errprint('Invalid max_victim_ancestry_depth value, not integer\nExit') errprint('Invalid max_victim_ancestry_depth value, not integer\nExit')
exit(1) exit(1)
if max_victim_ancestry_depth < 1: if max_victim_ancestry_depth < 1:
errprint('Invalid max_victim_ancestry_depth value\nExit') errprint('Invalud max_victim_ancestry_depth value\nExit')
exit(1) exit(1)
else: else:
missing_config_key('max_victim_ancestry_depth') missing_config_key('max_victim_ancestry_depth')
@@ -3966,6 +3958,29 @@ if check_kmsg:
if ZFS: if ZFS:
log('WARNING: ZFS found. Available memory will not be calculated ' log('WARNING: ZFS found. Available memory will not be calculated '
'correctly (issue#89)') 'correctly (issue#89)')
try:
# find indexes
with open(arcstats_path, 'rb') as f:
a_list = f.read().decode().split('\n')
for n, line in enumerate(a_list):
if line.startswith('c_min '):
c_min_index = n
elif line.startswith('size '):
size_index = n
elif line.startswith('arc_meta_used '):
arc_meta_used_index = n
elif line.startswith('arc_meta_min '):
arc_meta_min_index = n
else:
continue
except Exception as e:
log(e)
ZFS = False
while True: while True:
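
The name-keyed arcstats parsing that appears in one column of the hunk above can be sketched as a standalone function. This is a minimal sketch, not the project's code: it takes the file contents as a string so it runs without ZFS, the `SAMPLE` text is made up for illustration, and the names `parse_arcstats_text` and `arc_reclaimable_kib` are hypothetical.

```python
def parse_arcstats_text(text):
    """Map each 'name' to its integer 'data' for rows after the header."""
    lines = iter(text.splitlines())
    # Skip everything up to and including the 'name type data' header row.
    for line in lines:
        if 'name' in line and 'data' in line:
            break
    stats = {}
    for line in lines:
        parts = line.split()
        # Keep only well-formed rows whose type column is '4'.
        if len(parts) == 3 and parts[1] == '4':
            stats[parts[0]] = int(parts[2])
    return stats


def arc_reclaimable_kib(stats):
    """Return reclaimable ARC size in KiB, never below zero."""
    rec = max(stats['size'] - stats['c_min'], 0)
    # The metadata counters are optional, so only use them when present.
    if 'arc_meta_used' in stats and 'arc_meta_min' in stats:
        rec += max(stats['arc_meta_used'] - stats['arc_meta_min'], 0)
    return rec / 1024


# Illustrative input only; real arcstats files have many more rows.
SAMPLE = """\
13 1 0x01 123 33456 1234567890 9876543210
name                            type data
c_min                           4    1048576
size                            4    3145728
"""
```

Looking values up by name does not depend on line positions, so a missing or reordered row does not shift the result the way a cached line index could.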