fix cgroup finding, fix UDError
427
misc/nonascii-nohang.conf
Normal file
@@ -0,0 +1,427 @@
This is the nohang config file.

Redesign of this config is in progress.

Lines starting with #, tabs and spaces are comments.

Lines starting with $ contain obligatory parameters.

Lines starting with @ contain optional parameters.

The configuration includes the following sections:

1. Memory levels to respond to as an OOM threat
2. Response on PSI memory metrics
3. The frequency of checking the level of available memory
(and CPU usage)
4. The prevention of killing innocent victims
5. Impact on the badness of processes via matching their
- names,
- cmdlines and
- UIDs
with regular expressions
6. The execution of a specific command instead of sending the
SIGTERM signal
7. GUI notifications:
- OOM prevention results and
- low memory warnings
8. Output verbosity
9. Misc

Just read the description of the parameters and edit the values.
Please restart the program after editing the config.

#####################################################################

1. Thresholds below which a signal should be sent to the victim

Sets the available memory levels at or below which SIGTERM or SIGKILL
signals are sent. The signal is sent when MemAvailable and
SwapFree (in /proc/meminfo) both drop below the corresponding
values at the same time. Can be specified in % (percent) and M (MiB).
Valid values are floating-point numbers from the range [0; 100] %.

MemAvailable levels.

mem_min_sigterm = 10 %
mem_min_sigkill = 5 %

SwapFree levels.

swap_min_sigterm = 10 %
swap_min_sigkill = 5 %

Sets the share of zram in memory above which the corresponding
signals are sent. As the share of zram in memory increases,
system responsiveness may drop. 90 % is a usual hang level,
so setting very high values is not recommended.

Can be specified in % and M. Valid values are floating-point
numbers from the range [0; 90] %.

zram_max_sigterm = 50 %
zram_max_sigkill = 55 %

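The rule above (a signal is sent only when MemAvailable and SwapFree are
both at or below their thresholds) can be sketched in Python as follows.
This is a minimal illustration, not nohang's actual code: the function
names parse_level and choose_signal are hypothetical, and it assumes that
percentages refer to MemTotal and SwapTotal respectively.

```python
def parse_level(value: str, total_mib: float) -> float:
    """Convert a config value such as '10 %' or '512 M' into MiB."""
    number, unit = value.split()
    if unit == "%":
        # Assumption: percent is taken relative to the given total.
        return float(number) * total_mib / 100
    if unit == "M":
        return float(number)
    raise ValueError(f"unsupported unit: {unit!r}")


def choose_signal(mem_available, swap_free, mem_total, swap_total, cfg):
    """Return 'SIGKILL', 'SIGTERM' or None. All amounts are in MiB."""
    def both_low(mem_key, swap_key):
        # Both conditions must hold at the same time.
        return (mem_available <= parse_level(cfg[mem_key], mem_total)
                and swap_free <= parse_level(cfg[swap_key], swap_total))

    if both_low("mem_min_sigkill", "swap_min_sigkill"):
        return "SIGKILL"
    if both_low("mem_min_sigterm", "swap_min_sigterm"):
        return "SIGTERM"
    return None


cfg = {
    "mem_min_sigterm": "10 %", "mem_min_sigkill": "5 %",
    "swap_min_sigterm": "10 %", "swap_min_sigkill": "5 %",
}
```

With 8000 MiB of RAM and 2000 MiB of swap, low memory alone does not
trigger a signal in this sketch as long as plenty of swap is still free.
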
#####################################################################

2. Response on PSI memory metrics (requires Linux 4.20 or newer)

About PSI:
https://facebookmicrosites.github.io/psi/

Disabled by default (ignore_psi = True).

ignore_psi = True

Choose the path to the PSI file.
By default the system-wide file is monitored: /proc/pressure/memory
You can also set a file to monitor a single cgroup slice.
For example:
psi_path = /sys/fs/cgroup/unified/user.slice/memory.pressure
psi_path = /sys/fs/cgroup/unified/system.slice/memory.pressure
psi_path = /sys/fs/cgroup/unified/system.slice/foo.service/memory.pressure

Execute the command
find /sys/fs/cgroup | grep -P "memory\.pressure$"
to find available memory.pressure files (except /proc/pressure/memory).

psi_path = /proc/pressure/memory

Valid psi_metrics are:
some_avg10
some_avg60
some_avg300
full_avg10
full_avg60
full_avg300

some_avg10 is the most sensitive.

psi_metrics = some_avg10

sigterm_psi_threshold = 80
sigkill_psi_threshold = 90

psi_post_action_delay = 60

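To show how a metric name such as some_avg10 maps onto the contents of a
PSI file, here is a small Python sketch. The kernel writes two lines
("some ..." and "full ...") with avg10, avg60, avg300 and total fields.
The helper names parse_psi and psi_action are hypothetical, and comparing
with >= against the thresholds is an assumption, not something this
config states.

```python
def parse_psi(text: str) -> dict:
    """Parse PSI file contents into {'some_avg10': 0.0, ...}."""
    metrics = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        kind, *fields = line.split()      # kind is 'some' or 'full'
        for field in fields:
            key, value = field.split("=")
            if key.startswith("avg"):     # skip the cumulative 'total'
                metrics[f"{kind}_{key}"] = float(value)
    return metrics


def psi_action(text, metric="some_avg10", sigterm=80, sigkill=90):
    """Return 'SIGKILL', 'SIGTERM' or None for the chosen metric."""
    value = parse_psi(text)[metric]
    if value >= sigkill:                  # assumption: inclusive comparison
        return "SIGKILL"
    if value >= sigterm:
        return "SIGTERM"
    return None


sample = (
    "some avg10=85.50 avg60=40.00 avg300=10.00 total=12345\n"
    "full avg10=12.00 avg60=5.00 avg300=1.00 total=678\n"
)
```

In a real check the text would come from reading psi_path, for example
open("/proc/pressure/memory").read() on a kernel with PSI enabled.
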
#####################################################################

3. The frequency of checking the amount of available memory
(and CPU usage)

Coefficients that affect the intensity of monitoring. Reducing
the coefficients can reduce CPU usage and increase the periods
between memory checks.

Why three coefficients instead of one? Because the swap fill rate
is usually lower than the RAM fill rate.

It is possible to set a lower intensity of monitoring for swap
without compromising OOM prevention and thus reduce the CPU load.

The default values work well for desktops. On servers without rapid
fluctuations in memory levels the values can be reduced.

Valid values are positive floating-point numbers.

rate_mem = 4000
rate_swap = 1500
rate_zram = 500

See also https://github.com/rfjakob/earlyoom/issues/61


Maximum sleep time between memory checks.
A positive number.

max_sleep_time = 3

Minimum sleep time between memory checks.
A positive number not exceeding max_sleep_time.

min_sleep_time = 0.1

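The exact formula that turns these coefficients into a sleep period is
not given in this file. The Python sketch below only illustrates the
stated behaviour (lower coefficients mean longer periods, bounded by
min_sleep_time and max_sleep_time). It assumes amounts in MiB, treats the
coefficients as fill rates, and leaves rate_zram out; the function name
sleep_time is hypothetical.

```python
def sleep_time(mem_available_mib, swap_free_mib,
               rate_mem=4000, rate_swap=1500,
               min_sleep_time=0.1, max_sleep_time=3):
    """Estimate how long it is safe to sleep before the next check."""
    # Assumption: the more free memory and swap there is, and the lower
    # the coefficients are, the longer the period between checks.
    t = mem_available_mib / rate_mem + swap_free_mib / rate_swap
    # Clamp the result to the configured bounds.
    return max(min_sleep_time, min(t, max_sleep_time))
```

With plenty of free memory the period saturates at max_sleep_time; close
to exhaustion it drops to min_sleep_time, so checks become more frequent
exactly when they matter.
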
#####################################################################

4. The prevention of killing innocent victims

The minimum badness (equal to oom_score by default) that a
process must have for a signal to be sent to it.
This helps prevent killing innocent processes if something
goes wrong.

Valid values are integers from the range [0; 1000].

min_badness = 20

Minimum delay after sending the corresponding signals,
to reduce the risk of killing many processes at once.

Valid values are non-negative floating-point numbers.

min_delay_after_sigterm = 0.2
min_delay_after_sigkill = 1

Chromium browser processes usually have an oom_score_adj of
200 or 300. As a result, Chromium processes die first
instead of the processes that are actually heavy.
If decrease_oom_score_adj is set to True, then for processes
whose oom_score_adj is above oom_score_adj_max,
oom_score_adj will be lowered to oom_score_adj_max
before the victim is searched for.

Enabling the option requires root privileges.
Valid values are True and False.
Values are case sensitive.

decrease_oom_score_adj = False

Valid values are integers from the range [0; 1000].

oom_score_adj_max = 20

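The two safeguards in this section can be sketched in Python as pure
functions. This is an illustration under assumptions, not nohang's code:
the names clamp_oom_score_adj and pick_victim are hypothetical, and the
sketch assumes that the victim is the process with the highest badness
and that a badness equal to min_badness is still eligible.

```python
def clamp_oom_score_adj(current: int, oom_score_adj_max: int = 20) -> int:
    """Value to write back when decrease_oom_score_adj is True."""
    # Only values above the limit are lowered; others stay unchanged.
    # Writing the result to /proc/<pid>/oom_score_adj needs root.
    return min(current, oom_score_adj_max)


def pick_victim(badness_by_pid: dict, min_badness: int = 20):
    """Return the PID with the highest badness, or None if none qualify."""
    eligible = {pid: b for pid, b in badness_by_pid.items()
                if b >= min_badness}
    if not eligible:
        return None
    return max(eligible, key=eligible.get)
```

Returning None when nothing reaches min_badness is what keeps low-badness
processes from being signalled if something goes wrong.
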
#####################################################################

5. Impact on the badness of processes via matching their names,
cmdlines or UIDs with regular expressions using re.search().

See https://en.wikipedia.org/wiki/Regular_expression and
https://en.wikipedia.org/wiki/Perl_Compatible_Regular_Expressions

Enabling these options slows down the search for the victim
because the names, cmdlines or UIDs of all processes
(except init and kthreads) are compared with the
specified regex patterns (in fact the slowdown is caused by
reading all /proc/*/cmdline and /proc/*/status files).

Use the `oom-sort` script from the nohang package to view
names, cmdlines and UIDs of processes.


5.1 Matching process names with RE patterns

Valid values are True and False.

regex_matching = False

Syntax:

@PROCESSNAME_RE badness_adj /// RE_pattern

The new badness value will be the old value plus badness_adj.

It is possible to compare multiple patterns
with different badness_adj values.

Example:

@PROCESSNAME_RE -100 /// ^Xorg$

@PROCESSNAME_RE -500 /// ^sshd$

5.2 Matching cmdlines with RE patterns

A good option that allows fine adjustment.

re_match_cmdline = False

@CMDLINE_RE 300 /// -childID|--type=renderer

@CMDLINE_RE -200 /// ^/usr/lib/virtualbox


5.3 Matching UIDs with RE patterns

The slowest option.

re_match_uid = False

@UID_RE -100 /// ^0$

5.4 Matching CGroup-line with RE patterns

re_match_cgroup = True

@CGROUP_RE -50 /// system.slice

@CGROUP_RE 50 /// foo.service
@CGROUP_RE 2000 /// user.slice

5.5 Matching realpath with RE patterns

re_match_realpath = False

@REALPATH_RE 20 /// ^/usr/bin/foo

Note that you can also control badness in systemd units with OOMScoreAdjust, see
https://www.freedesktop.org/software/systemd/man/systemd.exec.html#OOMScoreAdjust=

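The rule syntax above (keyword, badness_adj, triple slash, pattern) and
the "+= badness_adj" behaviour can be illustrated with a short Python
sketch built on re.search() semantics. The helper names parse_rule and
adjust_badness and the attrs dictionary layout are hypothetical; how
nohang itself stores the process attributes is not shown in this file.

```python
import re


def parse_rule(line: str):
    """Split '@PROCESSNAME_RE -100 /// ^Xorg$' into its three parts."""
    head, pattern = line.split("///", 1)
    keyword, adj = head.split()
    return keyword, int(adj), re.compile(pattern.strip())


def adjust_badness(badness: int, attrs: dict, rules: list) -> int:
    """Add badness_adj for every rule whose pattern matches.

    attrs maps a rule keyword to the string it is matched against,
    e.g. {'@PROCESSNAME_RE': 'Xorg', '@CMDLINE_RE': '/usr/bin/Xorg :0'}.
    """
    for keyword, adj, pattern in rules:
        value = attrs.get(keyword)
        # pattern.search() matches anywhere, like re.search().
        if value is not None and pattern.search(value):
            badness += adj
    return badness


rules = [
    parse_rule("@PROCESSNAME_RE -100 /// ^Xorg$"),
    parse_rule("@CMDLINE_RE 300 /// -childID|--type=renderer"),
]
```

Because several rules may match the same process, the adjustments in
this sketch accumulate rather than override each other.
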
#####################################################################

6. The execution of a specific command instead of sending the
SIGTERM signal.

For processes with a specific name you can specify a command to
run instead of sending the SIGTERM signal.

For example, if the process is running as a daemon, you can run
the restart command instead of sending SIGTERM.

Valid values are True and False.

execute_the_command = False

The length of the process name can't exceed 15 characters.
The syntax is as follows: lines starting with the keyword $ETC are
treated as lines containing process names and the
corresponding commands. The process name is followed by a triple
slash (///), and then by the command that will be
executed if the specified process is selected as a victim. An
ampersand (&) at the end of the command allows nohang to
continue running without waiting for the command to
finish.

For example:
$ETC mysqld /// systemctl restart mariadb.service &
$ETC php-fpm7.0 /// systemctl restart php7.0-fpm.service

If the command contains the $PID pattern, this template ($PID) will
be replaced by the PID of the process whose name matches the RE pattern.

Example:

$ETC bash /// kill -KILL $PID

This is a way to send any signal instead of SIGTERM.
(run `kill -L` to see the list of all signals)

Also, $NAME will be replaced by the process name.

$ETC bash /// kill -9 $PID

$ETC firefox-esr /// kill -SEGV $PID

$ETC tail /// kill -9 $PID

$ETC apache2 /// systemctl restart apache2


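The $ETC syntax and the $PID / $NAME substitution described above can be
sketched in Python as follows. The function names parse_etc and
build_command are hypothetical, and plain string replacement is an
assumption about how the templates are expanded.

```python
def parse_etc(line: str):
    """Split '$ETC name /// command' into (name, command)."""
    head, command = line.split("///", 1)
    keyword, name = head.split()
    if keyword != "$ETC":
        raise ValueError(f"not an $ETC line: {line!r}")
    return name, command.strip()


def build_command(template: str, pid: int, name: str) -> str:
    """Expand the $PID and $NAME templates in a command."""
    return template.replace("$PID", str(pid)).replace("$NAME", name)


def runs_in_background(command: str) -> bool:
    """True if the command ends with '&' (nohang does not wait for it)."""
    return command.rstrip().endswith("&")
```

For example, parsing the line "$ETC bash /// kill -KILL $PID" and
expanding it for PID 1234 yields the command "kill -KILL 1234".
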
#####################################################################

7. GUI notifications:
- OOM prevention results and
- low memory warnings

Enabling this option requires notify-send to be present on the system.
On Debian/Ubuntu it is provided by the libnotify-bin package;
on Fedora and Arch Linux, by the libnotify package.
A notification server is also required.
When nohang runs as root, notifications are sent to all
logged-in users.
See also wiki.archlinux.org/index.php/Desktop_notifications
Valid values are True and False.

gui_notifications = False

Enable GUI notifications about the low level of available memory.
Valid values are True and False.

gui_low_memory_warnings = False

Execute the command instead of sending GUI notifications if the value is
not an empty line. For example:
warning_exe = cat /proc/meminfo &

warning_exe =

If MemAvailable and SwapFree are both below the corresponding
values at the same time, notifications will be sent.

Can be specified in % (percent) and M (MiB).
Valid values are floating-point numbers from the range [0; 100] %.

mem_min_warnings = 25 %

swap_min_warnings = 25 %

If the share of zram in memory exceeds zram_max_warnings,
notifications will be sent with a minimum period equal to
min_time_between_warnings.

zram_max_warnings = 40 %

Minimum time between notifications, in seconds.
Valid values are floating-point numbers from the range [1; 300].

min_time_between_warnings = 15

Ampersands (&) will be replaced with asterisks (*) in process
names and in commands.

#####################################################################

8. Verbosity

Display the configuration when the program starts.
Valid values are True and False.

print_config = False

Print memory check results.
Valid values are True and False.

print_mem_check_results = False

Minimum interval between printing memory status.
0 means print every memory check.
A non-negative number.

min_mem_report_interval = 60

Print sleep periods between memory checks.
Valid values are True and False.

print_sleep_periods = False

Print overall statistics on corrective actions since nohang
started, after each corrective action.

print_total_stat = True

Print the process table before each corrective action.

print_proc_table = False

print_victim_info = True

Maximum depth of the victim's ancestry to display.
By default (1), only the parent (PPID) is shown.
A positive integer.

max_ancestry_depth = 1

separate_log = False

psi_debug = False

#####################################################################

9. Misc

The victim may not respond to SIGTERM.
max_post_sigterm_victim_lifetime is the time after which
the victim will receive SIGKILL.
A non-negative number.

max_post_sigterm_victim_lifetime = 10

Execute the command after sending SIGKILL to the victim if the value is
not an empty line. For example:
post_kill_exe = cat /proc/meminfo &

post_kill_exe =

forbid_negative_badness = True