update translations and cosmetic changes

Alexey Avramov 2018-07-21 05:31:45 +09:00
parent 84c13e9e90
commit efee639e15
3 changed files with 282 additions and 234 deletions

@ -1,19 +1,32 @@
The No Hang Daemon
==================
# Nohang
`Nohang` is a highly configurable daemon for Linux which is able to correctly prevent out of memory conditions.
`Nohang` is a highly configurable daemon for Linux which is able to correctly prevent out of memory conditions and save disk cache.
### What is the problem?
## What is the problem?
OOM killer doesn't prevent OOM conditions.
OOM killer doesn't prevent OOM conditions. And OOM conditions may cause loss of disk cache, [freezes](https://en.wikipedia.org/wiki/Hang_(computing)), [livelocks](https://en.wikipedia.org/wiki/Deadlock#Livelock) and killing of multiple processes.
### Solutions
"How do I prevent Linux from freezing when out of memory?
- Use of [earlyoom](https://github.com/rfjakob/earlyoom). This is a simple OOM preventer written in C.
- Use of nohang. This is an advanced OOM preventer written in Python.
Today I (accidentally) ran some program on my Linux box that quickly used a lot of memory. My system froze, became unresponsive and thus I was unable to kill the offender.
### Some features
How can I prevent this in the future? Can't it at least keep a responsive core or something running?"
[serverfault](https://serverfault.com/questions/390623/how-do-i-prevent-linux-from-freezing-when-out-of-memory)
"With or without swap it still freezes before the OOM killer gets run automatically. This is really a kernel bug that should be fixed (i.e. run OOM killer earlier, before dropping all disk cache). Unfortunately kernel developers and a lot of other folk fail to see the problem. Common suggestions such as disable/enable swap, buy more RAM, run less processes, set limits etc. do not address the underlying problem that the kernel's low memory handling sucks camel's balls."
[serverfault](https://serverfault.com/questions/390623/how-do-i-prevent-linux-from-freezing-when-out-of-memory)
Also look at "Why are low memory conditions handled so badly?" [r/linux](https://www.reddit.com/r/linux/comments/56r4xj/why_are_low_memory_conditions_handled_so_badly/) - discussion with 480+ posts.
## Solutions
- Use of [earlyoom](https://github.com/rfjakob/earlyoom). This is a simple and lightweight OOM preventer written in C.
- Use of [oomd](https://github.com/facebookincubator/oomd). This is a userspace OOM killer for Linux systems written in C++ and developed by Facebook.
- Use of nohang.
## Some features
- convenient configuration with a well commented config file (there are 35 parameters in the config)
- `SIGKILL` and `SIGTERM` as signals that can be sent to the victim
@ -24,7 +37,7 @@ OOM killer doesn't prevent OOM conditions.
- possibility of restarting processes via command like `systemctl restart something` if the process is selected as a victim
- look at the [config](https://github.com/hakavlad/nohang/blob/master/nohang.conf) to find more
### Demo
## Demo
[Video](https://youtu.be/DefJBaKD7C8): nohang prevents OOM after the command `while true; do tail /dev/zero; done` has been executed.
@ -58,40 +71,40 @@ MemAvail: 1535 M, 26.1 %
```
And demo: https://youtu.be/5d6UovJzK8k
### Requirements
## Requirements
- `Linux 3.14+` (because the MemAvailable parameter appeared in /proc/meminfo since kernel version 3.14) and `Python 3.4+` (compatibility with earlier versions was not tested) for basic usage
- `libnotify` (Fedora, Arch) or `libnotify-bin` (Debian, Ubuntu) for desktop notifications and `sudo` for desktop notifications as root
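
As a rough illustration of why the `MemAvailable` field matters, the sketch below parses text in the `/proc/meminfo` format and checks whether that field is present. The helper name `parse_meminfo` and the sample numbers are assumptions made for this example; this is not the code nohang itself uses.

```python
def parse_meminfo(text):
    """Return a dict of /proc/meminfo fields in KiB (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(':')
        parts = rest.split()
        # Keep only lines that look like "Name:   12345 kB"
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


# Sample text in the /proc/meminfo format; the numbers are made up.
sample = (
    'MemTotal:        6000000 kB\n'
    'MemFree:          200000 kB\n'
    'MemAvailable:    1500000 kB\n'
    'SwapTotal:       2000000 kB\n'
    'SwapFree:        1000000 kB\n'
)

info = parse_meminfo(sample)
if 'MemAvailable' not in info:
    print('MemAvailable is missing: the kernel is probably older than 3.14')
else:
    print('MemAvailable: {} MiB'.format(round(info['MemAvailable'] / 1024)))
```

On a real system the text would come from reading `/proc/meminfo` instead of the `sample` string.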
### Memory and CPU usage
## Memory and CPU usage
- VmRSS is 10 — 13.5 MiB depending on the settings
- CPU usage depends on the level of available memory (the frequency of memory status checks increases as the amount of available memory decreases) and monitoring intensity (can be changed by user via config)
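
One simple way to make checks more frequent as memory runs out is to sleep for a time proportional to the amount of available memory. The sketch below shows that idea only; the function name `sleep_period`, the `rate_kib_per_sec` parameter and the `min_period` floor are assumptions for illustration and may differ from what nohang actually does.

```python
def sleep_period(mem_available_kib, rate_kib_per_sec, min_period=0.01):
    """Seconds to sleep before the next memory check (illustrative only).

    The less memory is available, the shorter the period, so checks
    become more frequent as an out-of-memory condition approaches.
    """
    if rate_kib_per_sec <= 0:
        raise ValueError('rate must be positive')
    return max(mem_available_kib / rate_kib_per_sec, min_period)


# The period shrinks together with the amount of available memory.
for available in (4000000, 400000, 40000):
    print(available, sleep_period(available, 2000000))
```

A larger rate gives shorter periods and therefore more intensive monitoring at the cost of more CPU time.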
### Status
## Status
The program is unstable and some fixes are required before the first stable version is released (documentation, translation, review and some optimisation are still needed).
### Download
## Download
```bash
git clone https://github.com/hakavlad/nohang.git
cd nohang
```
### Installation and start for systemd users
## Installation and start for systemd users
```bash
sudo ./install.sh
```
### Purge
## Purge
```bash
sudo ./purge.sh
```
### Command line options
## Command line options
```
./nohang -h
@ -104,7 +117,7 @@ optional arguments:
./nohang.conf, /etc/nohang/nohang.conf
```
### How to configure nohang
## How to configure nohang
The program can be configured by editing the [config file](https://github.com/hakavlad/nohang/blob/master/nohang.conf). The configuration includes the following sections:
@ -119,7 +132,7 @@ The program can be configured by editing the [config file](https://github.com/ha
Just read the description of the parameters and edit the values. Please restart nohang to apply changes. Default path to the config after installing via `./install.sh` is `/etc/nohang/nohang.conf`.
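
To give a feel for the parameter format, here is a minimal sketch that assumes parameters are written as `name = value` lines (for example `decrease_oom_score_adj = False`) and that boolean values must be exactly `True` or `False`. The helpers `parse_params` and `parse_bool` are hypothetical, and the sketch naively treats every line containing `=` as a parameter, so the real parser may well behave differently.

```python
def parse_params(text):
    """Collect `name = value` lines into a dict of strings (illustrative)."""
    params = {}
    for line in text.splitlines():
        if '=' not in line:
            continue  # description text, separators, blank lines
        name, _, value = line.partition('=')
        params[name.strip()] = value.strip()
    return params


def parse_bool(params, name):
    """Return True or False for a parameter; raise ValueError otherwise."""
    if name not in params:
        raise ValueError(
            'There is no "{}" parameter in the config'.format(name))
    value = params[name]
    if value == 'True':
        return True
    if value == 'False':
        return False
    raise ValueError(
        'Invalid value of the "{}" parameter: {}'.format(name, value))


# A made-up fragment in the style of the config file.
sample = (
    'Valid values are True and False.\n'
    '\n'
    'decrease_oom_score_adj = False\n'
    'oom_score_adj_max = 20\n'
    'mem_min_sigterm = 9 %\n'
)

params = parse_params(sample)
print(parse_bool(params, 'decrease_oom_score_adj'))
print(params['mem_min_sigterm'])
```

Raising an exception instead of calling `exit()` keeps the helpers easy to test; a daemon could catch the error at startup and print the message before exiting.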
### Feedback
## Feedback
Please create [issues](https://github.com/hakavlad/nohang/issues). Use cases, feature requests and any questions are welcome.

293
nohang

@ -6,22 +6,20 @@ import os
from operator import itemgetter
from time import sleep, time
from argparse import ArgumentParser
from subprocess import Popen
# from subprocess import Popen
sig_dict = {9: 'SIGKILL', 15: 'SIGTERM'}
# директория, в которой запущен скрипт
# directory where the script is running
cd = os.getcwd()
# где искать конфиг, если не указан через опцию -c/--config
default_configs = (
cd + '/nohang.conf',
'/etc/nohang/nohang.conf'
)
# where to look for a config if not specified via the -c/--config option
default_configs = (cd + '/nohang.conf', '/etc/nohang/nohang.conf')
# universal message if config is invalid
conf_err_mess = '\nSet up the path to the valid config file with -c/--confi' \
'g option!\nExit'
conf_err_mess = '\nSet up the path to the valid conf' \
'ig file with -c/--config option!\nExit'
# means that setting zram disksize = 10000M reduces the available
# memory by 42M
@ -31,6 +29,9 @@ conf_err_mess = '\nSet up the path to the valid config file with -c/--confi' \
# ("zram uses about 0.1% of the size of the disk"
# - https://www.kernel.org/doc/Documentation/blockdev/zram.txt),
# but this statement contradicts the experimental data
# zram_disksize_factor = deltaMemAvailable / disksize
# found experimentally
zram_disksize_factor = 0.0042
name_strip_string = '\'"`\\!-$'
@ -54,16 +55,17 @@ def string_to_int_convert_test(string):
return None
# извлечение праметра из словаря конфига, возврат str
# extracting the parameter from the config dictionary, str return
def conf_parse_string(param):
if param in config_dict:
return config_dict[param].strip()
else:
print('{} not in config\nExit'.format(param))
print('All the necessary parameters must be in the config')
print('There is no "{}" parameter in the config'.format(param))
exit()
# извлечение праметра из словаря конфига, возврат bool
# extracting the parameter from the config dictionary, bool return
def conf_parse_bool(param):
if param in config_dict:
param_str = config_dict[param]
@ -72,22 +74,20 @@ def conf_parse_bool(param):
elif param_str == 'False':
return False
else:
print('Invalid {} value {} (shou' \
'ld be True or False)\nExit'.format(param, param_str))
print('Invalid value of the "{}" parameter.'.format(param))
print('Valid values are True and False.')
print('Exit')
exit()
else:
print('{} not in config\nExit'.format(param))
print('All the necessary parameters must be in the config')
print('There is no "{}" parameter in the config'.format(param))
exit()
def func_decrease_oom_score_adj(oom_score_adj_max):
# loop that fills oom_list
for i in os.listdir('/proc'):
# skip entries that do not consist only of digits
if i.isdigit() is not True:
continue
try:
oom_score_adj = int(rline1('/proc/' + i + '/oom_score_adj'))
if oom_score_adj > oom_score_adj_max:
@ -99,14 +99,14 @@ def func_decrease_oom_score_adj(oom_score_adj_max):
pass
# чтение первой строки файла
# read 1st line
def rline1(path):
with open(path) as f:
for line in f:
return line[:-1]
# запись в файл
# write in file
def write(path, string):
with open(path, 'w') as f:
f.write(string)
@ -128,12 +128,12 @@ def just_percent_swap(num):
return str(round(num * 100, 1)).rjust(5, ' ')
# K -> M, выравнивание по правому краю
# KiB to MiB, right alignment
def human(num, lenth):
return str(round(num / 1024)).rjust(lenth, ' ')
# возвращает disksize и mem_used_total по zram id
# return str with amount of bytes
def zram_stat(zram_id):
try:
disksize = rline1('/sys/block/' + zram_id + '/disksize')
@ -153,7 +153,7 @@ def zram_stat(zram_id):
return disksize, mem_used_total # BYTES, str
# имя через пид
# return process name
def pid_to_name(pid):
try:
with open('/proc/' + pid + '/status') as f:
@ -166,7 +166,7 @@ def pid_to_name(pid):
def send_notify_warn():
# text of the notification to be sent
if mem_used_zram > 0:
info = '"<i>MemAvailable:</i> <b>{} MiB</b>\n<i>SwapFree:</i> <b>{} MiB</b>\n<i>MemUsedZram:</i> <b>{} MiB</b>" &'.format(
kib_to_mib(mem_available),
@ -229,8 +229,6 @@ def sleep_after_send_signal(signal):
def find_victim_and_send_signal(signal):
time0 = time()
print(mem_info)
# set the ceiling for oom_score_adj of all processes
@ -240,7 +238,7 @@ def find_victim_and_send_signal(signal):
# get the list of processes ((pid, badness))
oom_list = []
if use_regex_lists:
if regex_matching:
for pid in os.listdir('/proc'):
if pid.isdigit() is not True:
@ -250,16 +248,16 @@ def find_victim_and_send_signal(signal):
oom_score = int(rline1('/proc/' + pid + '/oom_score'))
name = pid_to_name(pid)
res = fullmatch(avoidlist_regex, name)
res = fullmatch(avoid_regex, name)
if res is not None:
# here we already get the badness
oom_score = int(oom_score / avoidlist_factor)
print(' {} (Pid: {}, Badness {}) matches with avoidlist_regex'.format(name, pid, oom_score)),
oom_score = int(oom_score / avoid_factor)
print(' {} (Pid: {}, Badness {}) matches with avoid_regex'.format(name, pid, oom_score)),
res = fullmatch(preferlist_regex, name)
res = fullmatch(prefer_regex, name)
if res is not None:
oom_score = int((oom_score + 1) * preferlist_factor)
print(' {} (Pid: {}, Badness {}) matches with preferlist_regex'.format(name, pid, oom_score)),
oom_score = int((oom_score + 1) * prefer_factor)
print(' {} (Pid: {}, Badness {}) matches with prefer_regex'.format(name, pid, oom_score)),
except FileNotFoundError:
oom_score = 0
@ -287,7 +285,7 @@ def find_victim_and_send_signal(signal):
# get the maximum oom_score
oom_score = pid_tuple_list[1]
if oom_score >= oom_score_min:
if oom_score >= min_badness:
# try to send the signal to the victim that was found
@ -326,19 +324,24 @@ def find_victim_and_send_signal(signal):
else:
try:
try: # SUCCESS -> RESPONSE TIME
os.kill(int(pid), signal)
success_time = time()
delta_success = success_time - time0
send_result = ' Success; reaction time: {} ms'.format(round(delta_success * 1000))
send_result = ' Success; response time: {} ms\n'.format(round(delta_success * 1000)) + r'}'
if desktop_notifications:
if gui_notifications:
send_notify(signal, name, pid, oom_score, vm_rss, vm_swap)
except FileNotFoundError:
send_result = ' No such process'
success_time = time()
delta_success = success_time - time0
send_result = ' No such process; response time: {} ms'.format(round(delta_success * 1000))
except ProcessLookupError:
send_result = ' No such process'
success_time = time()
delta_success = success_time - time0
send_result = ' No such process; response time: {} ms'.format(round(delta_success * 1000))
try_to_send = ' Preventing OOM: trying to send the {} signal to {},\n Pid: {}, Badness: {}, VmRSS: {} MiB, VmSwap: {} MiB'.format(sig_dict[signal], name, pid, oom_score, vm_rss, vm_swap)
@ -347,8 +350,12 @@ def find_victim_and_send_signal(signal):
else:
badness_is_too_small = ' oom_score {} < oom_score_min {}'.format(
oom_score, oom_score_min)
success_time = time()
delta_success = success_time - time0
badness_is_too_small = ' oom_score {} < min_badness {}; response time: {} ms'.format(
oom_score, min_badness, round(delta_success * 1000))
print(badness_is_too_small)
@ -391,7 +398,7 @@ for s in mem_list:
mem_list_names.append(s.split(':')[0])
if mem_list_names[2] != 'MemAvailable':
print('Your Linux kernel is too old, 3.14+ requie\nExit')
print('Your Linux kernel is too old, Linux 3.14+ required\nExit')
exit()
swap_total_index = mem_list_names.index('SwapTotal')
@ -455,7 +462,7 @@ print(config)
##########################################################################
# парсинг конфига с получением словаря параметров
# parsing the config with obtaining the parameters dictionary
# conf_parameters_dict
# conf_restart_dict
@ -463,10 +470,10 @@ print(config)
try:
with open(config) as f:
# словарь с параметрами конфига
# dictionary with config options
config_dict = dict()
# словарь с именами и командами для параметра execute_the_command
# dictionary with names and commands for the parameter execute_the_command
etc_dict = dict()
for line in f:
@ -487,7 +494,7 @@ try:
etc_name = a[0].strip()
etc_command = a[1].strip()
if len(etc_name) > 15:
print('инвалид конфиг, длина имени процесса не должна превышать 15 символов\nExit')
print('Invalid config, the length of the process name must not exceed 15 characters\nExit')
exit()
etc_dict[etc_name] = etc_command
@ -506,9 +513,9 @@ except IndexError:
##########################################################################
# извлечение параметров из словаря
# проверка наличия всех необходимых параметров
# валидация всех параметров
# extracting parameters from the dictionary
# check for all necessary parameters
# validation of all parameters
print_config = conf_parse_bool('print_config')
@ -523,47 +530,59 @@ print_sleep_periods = conf_parse_bool('print_sleep_periods')
realtime_ionice = conf_parse_bool('realtime_ionice')
if 'realtime_ionice_classdata' in config_dict:
realtime_ionice_classdata = string_to_int_convert_test(
config_dict['realtime_ionice_classdata'])
if realtime_ionice_classdata is None:
print('Invalid realtime_ionice_classdata value, not integer\nExit')
print('Invalid value of the "realtime_ionice_classdata" parameter.')
print('Valid values are integers from the range [0; 7].')
print('Exit')
exit()
if realtime_ionice_classdata < 0 or realtime_ionice_classdata > 7:
print('Invalid realtime_ionice_classdata value\nExit')
print('Invalid value of the "realtime_ionice_classdata" parameter.')
print('Valid values are integers from the range [0; 7].')
print('Exit')
exit()
else:
print('realtime_ionice_classdata not in config\nExit')
print('All the necessary parameters must be in the config')
print('There is no "realtime_ionice_classdata" parameter in the config')
exit()
mlockall = conf_parse_bool('mlockall')
if 'self_nice' in config_dict:
self_nice = string_to_int_convert_test(config_dict['self_nice'])
if self_nice is None:
print('Invalid self_nice value, not integer\nExit')
if 'niceness' in config_dict:
niceness = string_to_int_convert_test(config_dict['niceness'])
if niceness is None:
print('Invalid niceness value, not integer\nExit')
exit()
if self_nice < -20 or self_nice > 19:
print('Invalid self_nice value\nExit')
if niceness < -20 or niceness > 19:
print('Invalid niceness value\nExit')
exit()
else:
print('self_nice not in config\nExit')
print('niceness not in config\nExit')
exit()
if 'self_oom_score_adj' in config_dict:
self_oom_score_adj = string_to_int_convert_test(
config_dict['self_oom_score_adj'])
if self_oom_score_adj is None:
print('Invalid self_oom_score_adj value, not integer\nExit')
if 'oom_score_adj' in config_dict:
oom_score_adj = string_to_int_convert_test(
config_dict['oom_score_adj'])
if oom_score_adj is None:
print('Invalid oom_score_adj value, not integer\nExit')
exit()
if self_oom_score_adj < -1000 or self_oom_score_adj > 1000:
print('Invalid self_oom_score_adj value\nExit')
if oom_score_adj < -1000 or oom_score_adj > 1000:
print('Invalid oom_score_adj value\nExit')
exit()
else:
print('self_oom_score_adj not in config\nExit')
print('oom_score_adj not in config\nExit')
exit()
@ -813,17 +832,17 @@ else:
exit()
if 'oom_score_min' in config_dict:
oom_score_min = string_to_int_convert_test(
config_dict['oom_score_min'])
if oom_score_min is None:
print('Invalid oom_score_min value, not integer\nExit')
if 'min_badness' in config_dict:
min_badness = string_to_int_convert_test(
config_dict['min_badness'])
if min_badness is None:
print('Invalid min_badness value, not integer\nExit')
exit()
if oom_score_min < 0 or oom_score_min > 1000:
print('Invalid oom_score_min value\nExit')
if min_badness < 0 or min_badness > 1000:
print('Invalid min_badness value\nExit')
exit()
else:
print('oom_score_min not in config\nExit')
print('min_badness not in config\nExit')
exit()
@ -844,10 +863,10 @@ else:
exit()
if 'desktop_notifications' in config_dict:
desktop_notifications = config_dict['desktop_notifications']
if desktop_notifications == 'True':
desktop_notifications = True
if 'gui_notifications' in config_dict:
gui_notifications = config_dict['gui_notifications']
if gui_notifications == 'True':
gui_notifications = True
users_dict = dict()
with open('/etc/passwd') as f:
for line in f:
@ -855,15 +874,15 @@ if 'desktop_notifications' in config_dict:
username = line_list[0]
uid = line_list[2]
users_dict[uid] = username
elif desktop_notifications == 'False':
desktop_notifications = False
elif gui_notifications == 'False':
gui_notifications = False
else:
print('Invalid desktop_notifications value {} (shoul' \
print('Invalid gui_notifications value {} (shoul' \
'd be True or False)\nExit'.format(
desktop_notifications))
gui_notifications))
exit()
else:
print('desktop_notifications not in config\nExit')
print('gui_notifications not in config\nExit')
exit()
@ -873,47 +892,47 @@ notify_options = conf_parse_string('notify_options')
root_display = conf_parse_string('root_display')
use_regex_lists = conf_parse_bool('use_regex_lists')
if use_regex_lists:
regex_matching = conf_parse_bool('regex_matching')
if regex_matching:
from re import fullmatch
preferlist_regex = conf_parse_string('preferlist_regex')
prefer_regex = conf_parse_string('prefer_regex')
if 'preferlist_factor' in config_dict:
preferlist_factor = string_to_float_convert_test(config_dict['preferlist_factor'])
if preferlist_factor is None:
print('Invalid preferlist_factor value, not float\nExit')
if 'prefer_factor' in config_dict:
prefer_factor = string_to_float_convert_test(config_dict['prefer_factor'])
if prefer_factor is None:
print('Invalid prefer_factor value, not float\nExit')
exit()
if preferlist_factor < 1 and preferlist_factor > 1000:
print('preferlist_factor must be in the range [1; 1000]\nExit')
if prefer_factor < 1 or prefer_factor > 1000:
print('prefer_factor must be in the range [1; 1000]\nExit')
exit()
else:
print('preferlist_factor not in config\nExit')
print('prefer_factor not in config\nExit')
exit()
avoidlist_regex = conf_parse_string('avoidlist_regex')
avoid_regex = conf_parse_string('avoid_regex')
if 'avoidlist_factor' in config_dict:
avoidlist_factor = string_to_float_convert_test(config_dict['avoidlist_factor'])
if avoidlist_factor is None:
print('Invalid avoidlist_factor value, not float\nExit')
if 'avoid_factor' in config_dict:
avoid_factor = string_to_float_convert_test(config_dict['avoid_factor'])
if avoid_factor is None:
print('Invalid avoid_factor value, not float\nExit')
exit()
if avoidlist_factor < 1 and avoidlist_factor > 1000:
print('avoidlist_factor must be in the range [1; 1000]\nExit')
if avoid_factor < 1 or avoid_factor > 1000:
print('avoid_factor must be in the range [1; 1000]\nExit')
exit()
else:
print('avoidlist_factor not in config\nExit')
print('avoid_factor not in config\nExit')
exit()
low_memory_warnings = conf_parse_bool('low_memory_warnings')
gui_low_memory_warnings = conf_parse_bool('gui_low_memory_warnings')
if 'min_time_between_warnings' in config_dict:
@ -1077,22 +1096,22 @@ else:
# raise the priority
try:
os.nice(self_nice)
self_nice_result = 'OK'
os.nice(niceness)
niceness_result = 'OK'
except PermissionError:
self_nice_result = 'Fail'
niceness_result = 'Fail'
pass
# option to protect the daemon itself from being killed
try:
with open('/proc/self/oom_score_adj', 'w') as file:
file.write('{}\n'.format(self_oom_score_adj))
self_oom_score_adj_result = 'OK'
file.write('{}\n'.format(oom_score_adj))
oom_score_adj_result = 'OK'
except PermissionError:
pass
self_oom_score_adj_result = 'Fail'
oom_score_adj_result = 'Fail'
except OSError:
self_oom_score_adj_result = 'Fail'
oom_score_adj_result = 'Fail'
pass
# prevent the process from being swapped out
@ -1111,6 +1130,10 @@ self_uid = os.geteuid()
self_pid = os.getpid()
if self_uid == 0:
root = True
decrease_res = 'OK'
@ -1140,11 +1163,11 @@ if print_config:
print('\nII. SELF-DEFENSE [displaying these options need fix]')
print('mlockall: {} ({})'.format(mlockall, mla_res))
print('self_nice: {} ({})'.format(
self_nice, self_nice_result
print('niceness: {} ({})'.format(
niceness, niceness_result
))
print('self_oom_score_adj: {} ({})'.format(
self_oom_score_adj, self_oom_score_adj_result
print('oom_score_adj: {} ({})'.format(
oom_score_adj, oom_score_adj_result
))
print('\nIII. INTENSITY OF MONITORING')
@ -1170,7 +1193,7 @@ if print_config:
print('\nV. PREVENTION OF KILLING INNOCENT VICTIMS')
print('min_delay_after_sigterm: {}'.format(min_delay_after_sigterm))
print('min_delay_after_sigkill: {}'.format(min_delay_after_sigkill))
print('oom_score_min: {}'.format(oom_score_min))
print('min_badness: {}'.format(min_badness))
# False (OK) - OK is not needed when the value is False
print('decrease_oom_score_adj: {} ({})'.format(
@ -1180,22 +1203,22 @@ if print_config:
print('oom_score_adj_max: {}'.format(oom_score_adj_max))
print('\nVI. DESKTOP NOTIFICATIONS')
print('desktop_notifications: {}'.format(desktop_notifications))
if desktop_notifications:
print('gui_notifications: {}'.format(gui_notifications))
if gui_notifications:
print('notify_options: {}'.format(notify_options))
print('root_display: {}'.format(root_display))
print('\nVII. AVOID AND PREFER VICTIM NAMES VIA REGEX')
print('use_regex_lists: {}'.format(use_regex_lists))
if use_regex_lists:
print('preferlist_regex: {}'.format(preferlist_regex))
print('preferlist_factor: {}'.format(preferlist_factor))
print('avoidlist_regex: {}'.format(avoidlist_regex))
print('avoidlist_factor: {}'.format(avoidlist_factor))
print('regex_matching: {}'.format(regex_matching))
if regex_matching:
print('prefer_regex: {}'.format(prefer_regex))
print('prefer_factor: {}'.format(prefer_factor))
print('avoid_regex: {}'.format(avoid_regex))
print('avoid_factor: {}'.format(avoid_factor))
print('\nIX. LOW MEMORY WARNINGS')
print('low_memory_warnings: {}'.format(low_memory_warnings))
if low_memory_warnings:
print('gui_low_memory_warnings: {}'.format(gui_low_memory_warnings))
if gui_low_memory_warnings:
print('min_time_between_warnings: {}'.format(min_time_between_warnings))
print('mem_min_warnings: {} MiB, {} %'.format(
@ -1218,7 +1241,7 @@ if print_config:
##########################################################################
# для рассчета ширины столбцов при печати mem и zram
# for calculating the column width when printing mem and zram
mem_len = len(str(round(mem_total / 1024.0)))
rate_mem = rate_mem * 1048576
@ -1233,11 +1256,11 @@ print('\nStart monitoring...')
##########################################################################
# loop that checks the levels of available memory
while True:
# находим mem_available, swap_total, swap_free
# find mem_available, swap_total, swap_free
with open('/proc/meminfo') as f:
for n, line in enumerate(f):
if n == 2:
@ -1252,7 +1275,7 @@ while True:
# если swap_min_sigkill задан в процентах
# if swap_min_sigkill is set in percent
if swap_kill_is_percent:
swap_min_sigkill_kb = swap_total * swap_min_sigkill_percent / 100.0
@ -1263,7 +1286,7 @@ while True:
swap_min_warnings_kb = swap_total * swap_min_warnings_percent / 100.0
# находим MemUsedZram
# find MemUsedZram
disksize_sum = 0
mem_used_total_sum = 0
for dev in os.listdir('/sys/block'):
@ -1325,6 +1348,7 @@ while True:
# MEM SWAP KILL
if mem_available <= mem_min_sigkill_kb and swap_free <= swap_min_sigkill_kb:
time0 = time()
mem_info = '* MemAvailable ({} MiB, {} %) < mem_min_sigkill ({} MiB, {} %)\n Swa' \
'pFree ({} MiB, {} %) < swap_min_sigkill ({} MiB, {} %)'.format(
@ -1344,6 +1368,7 @@ while True:
# ZRAM KILL
elif mem_used_zram >= zram_max_sigkill_kb:
time0 = time()
mem_info = '* MemUsedZram ({} MiB, {} %) > zram_max_sigkill ({} MiB, {} %)'.format(
kib_to_mib(mem_used_zram),
@ -1355,8 +1380,9 @@ while True:
# MEM SWAP TERM
elif mem_available <= mem_min_sigterm_kb and swap_free <= swap_min_sigterm_kb:
time0 = time()
mem_info = '* MemAvailable ({} MiB, {} %) < mem_min_sigterm ({} MiB, {} %)\n Sw' \
mem_info = r'{' + '\n MemAvailable ({} MiB, {} %) < mem_min_sigterm ({} MiB, {} %)\n Sw' \
'apFree ({} MiB, {} %) < swap_min_sigterm ({} MiB, {} %)'.format(
kib_to_mib(mem_available),
percent(mem_available / mem_total),
@ -1379,6 +1405,7 @@ while True:
# ZRAM TERM
elif mem_used_zram >= zram_max_sigterm_kb:
time0 = time()
mem_info = '* MemUsedZram ({} MiB, {} %) > zram_max_sigter' \
'm ({} M, {} %)'.format(
@ -1390,7 +1417,7 @@ while True:
find_victim_and_send_signal(15)
# LOW MEMORY WARNINGS
elif low_memory_warnings and desktop_notifications:
elif gui_low_memory_warnings and gui_notifications:
if mem_available < mem_min_warnings_kb and swap_free < swap_min_warnings_kb + 0.1 or mem_used_zram > zram_max_warnings_kb:
warn_time_delta = time() - warn_time_now

@ -5,31 +5,40 @@
The configuration includes the following sections:
* THRESHOLDS FOR SENDING SIGNALS TO VICTIMS
* INTENSITY OF MONITORING (AND CPU USAGE)
* PREVENTION OF KILLING INNOCENT VICTIMS
* AVOID AND PREFER VICTIM NAMES VIA REGEX MATCHING
* EXECUTE THE COMMAND INSTEAD OF SENDING THE SIGTERM SIGNAL
* GUI NOTIFICATIONS: RESULTS OF PREVENTING OOM AND LOW MEMORY WARNINGS
* SELF-DEFENSE AND PREVENTING SLOWING DOWN THE PROGRAM
* OUTPUT VERBOSITY
1. Memory levels to respond to as an OOM threat
2. The frequency of checking the level of available memory
(and CPU usage)
3. The prevention of killing innocent victims
4. Impact on the badness of processes via matching their names
with regular expressions
5. The execution of a specific command instead of sending the
SIGTERM signal
6. GUI notifications:
- results of preventing OOM
- low memory warnings
7. Preventing the slowing down of the program
8. Output verbosity
Just read the description of the parameters and edit the values.
Please restart the program after editing the config.
#####################################################################
* THRESHOLDS FOR SENDING SIGNALS TO VICTIMS
1. Thresholds below which a signal should be sent to the victim
Sets the available memory levels below which SIGTERM or SIGKILL
signals are sent. The signal will be sent if MemAvailable and
SwapFree at the same time will drop below the corresponding
values. Can be specified in % (percent) and M (MiB). Valid values
are floating-point numbers from the range [0; 100] %.
SwapFree (in /proc/meminfo) both drop below the corresponding
values at the same time. Can be specified in % (percent) and M (MiB).
Valid values are floating-point numbers from the range [0; 100] %.
MemAvailable levels.
mem_min_sigterm = 9 %
mem_min_sigkill = 6 %
SwapFree levels.
swap_min_sigterm = 9 %
swap_min_sigkill = 6 %
@ -41,30 +50,26 @@ swap_min_sigkill = 6 %
Can be specified in % and M. Valid values are floating-point
numbers from the range [0; 100] %.
zram_max_sigterm = 55 %
zram_max_sigkill = 60 %
zram_max_sigterm = 50 %
zram_max_sigkill = 55 %
#####################################################################
* INTENSITY OF MONITORING (AND CPU USAGE)
2. The frequency of checking the amount of available memory
(and CPU usage)
Coefficients that affect the intensity of monitoring. Reducing
the coefficients can reduce CPU usage and increase the periods
between memory checks.
Почему три коэффициента, а не один? - Потому что скорость
наполнения свопа обычно ниже скорости наполнения RAM.
Можно для свопа задать более низкую интенсивность
мониторинга без ущерба для предотвращения нехватки памяти
и тем самым снизить нагрузку на процессор.
Why three coefficients instead of one? Because the swap fill rate
is usually lower than the RAM fill rate.
With the default settings the daemon works well enough at this
intensity and copes successfully with sharp spikes in memory
consumption.
It is possible to set a lower intensity of monitoring for swap
without compromising OOM prevention, and thus reduce the CPU load.
Default values are well for desktop.
On servers without rapid fluctuations in memory level, the
values can be reduced.
Default values work well for desktops. On servers without rapid
fluctuations in memory levels the values can be reduced.
Valid values are positive floating-point numbers.
@ -74,20 +79,19 @@ rate_zram = 1
#####################################################################
* PREVENTION OF KILLING INNOCENT VICTIMS
3. The prevention of killing innocent victims
The minimum oom_score that a process must have for a signal
to be sent to it.
Helps to prevent killing innocent processes if something
goes wrong. Maybe min_badness, taking the lists into account?
goes wrong.
Valid values are integers from the range [0; 1000].
oom_score_min = 10
min_badness = 10
The minimum delay after sending the corresponding signals,
to reduce the risk of killing many processes at once.
Must be a non-negative number.
Valid values are non-negative floating-point numbers.
@ -104,6 +108,7 @@ min_delay_after_sigkill = 3
Enabling the option requires root privileges.
Valid values are True and False.
Values are case sensitive.
decrease_oom_score_adj = False
@ -113,75 +118,79 @@ oom_score_adj_max = 20
#####################################################################
* AVOID AND PREFER VICTIM NAMES VIA REGEX MATCHING
4. Impact on the badness of processes via matching their names
with regular expressions.
You can set regular expressions (Perl-compatible regular
expressions) that will be matched against process names
to affect their badness.
See https://en.wikipedia.org/wiki/Regular_expression and
https://en.wikipedia.org/wiki/Perl_Compatible_Regular_Expressions
Включение этой опции замедляет поиск жертвы, так как
имена всех процессов сравниваются с заданными regex-паттернами.
Enabling this option slows down the search for the victim
because the names of all processes are compared with the
specified regex patterns.
Valid values are True and False.
use_regex_lists = False
regex_matching = False
Badness процессов, имена которых соответствуют preferlist_regex,
будут рассчитываться по формуле
badness = (oom_score + 1) * preferlist_factor
Badness of processes whose names correspond to prefer_regex will
be calculated by the following formula:
badness = (oom_score + 1) * prefer_factor
preferlist_regex = tail|python3
prefer_regex = tail|python3
Valid values are floating-point numbers from the range [1; 1000].
preferlist_factor = 3
prefer_factor = 3
The list of processes that should preferably not be killed.
Badness of processes whose names correspond to avoid_regex will
be calculated by the following formula:
badness = oom_score / avoid_factor
Badness процессов, имена которых соответствуют avoidlist_regex,
будут рассчитываться по формуле
badness = oom_score / avoidlist_factor
avoidlist_regex = Xorg|sshd
avoid_regex = Xorg|sshd
Valid values are floating-point numbers from the range [1; 1000].
avoidlist_factor = 4
avoid_factor = 3
#####################################################################
* EXECUTE THE COMMAND INSTEAD OF SENDING THE SIGTERM SIGNAL
5. The execution of a specific command instead of sending the
SIGTERM signal.
For processes with a specific name you can specify a command to
run instead of sending the SIGTERM signal.
For example, if the process is running as a daemon, you can run
the restart command instead of sending SIGTERM.
Valid values are True and False.
execute_the_command = False
The length of the process name can't exceed 15 characters.
The syntax is as follows: lines starting with ** are treated
as lines containing process names and the corresponding
commands. The process name is followed by a double colon (::),
and then by the command that will be executed if the
specified process is selected as a victim.
An ampersand (&) at the end of the command allows nohang to
continue running without waiting for the command to finish.
For example:
** mysqld :: systemctl restart mariadb.service &
** php-fpm7.0 :: systemctl restart php7.0-fpm.service &
** php-fpm7.0 :: systemctl restart php7.0-fpm.service
** processname :: some command
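The `** name :: command` syntax described above could be parsed roughly as follows. This is a sketch under stated assumptions, not nohang's parser: the function name `parse_command_lines` is hypothetical, and silently skipping malformed lines or names longer than 15 characters is an illustrative choice.

```python
def parse_command_lines(lines):
    """Map process names to commands from '** name :: command' lines.

    Hypothetical helper; skipping invalid lines is an assumption.
    """
    commands = {}
    for line in lines:
        line = line.strip()
        if not line.startswith('**'):
            # Not a command line (ordinary option or comment).
            continue
        # Split on the first double colon only, so the command itself
        # may contain '::'.
        name, sep, command = line[2:].partition('::')
        name = name.strip()
        command = command.strip()
        if not sep or not name or not command:
            continue
        if len(name) > 15:
            # The config states that names can't exceed 15 characters.
            continue
        commands[name] = command
    return commands
```

A later line for the same process name overrides an earlier one in this sketch, which is again an assumption rather than documented behavior.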
Extra sleep time after executing the command (in addition to
min_sleep_after_sigterm).
#####################################################################
* GUI NOTIFICATIONS:
* RESULTS OF PREVENTING OOM
* LOW MEMORY WARNINGS
6. GUI notifications:
- results of preventing OOM
- low memory warnings
Enabling this option requires notify-send to be installed.
On Debian/Ubuntu this is provided by installing the package
@ -192,7 +201,7 @@ execute_the_command = False
See also wiki.archlinux.org/index.php/Desktop_notifications
Valid values are True and False.
desktop_notifications = False
gui_notifications = False
Additional options for notify-send.
See `notify-send --help` and read `man notify-send`
@ -213,7 +222,7 @@ root_display = :0
Desktop notifications must be enabled for this option to work.
Valid values are True and False.
low_memory_warnings = False
gui_low_memory_warnings = True
Minimum time between notifications, in seconds.
Valid values are floating-point numbers from the range [1; 300].
@ -238,32 +247,32 @@ zram_max_warnings = 40 %
#####################################################################
* SELF-DEFENSE AND PREVENTING SLOWING DOWN THE PROGRAM
7. Preventing the slowing down of the program
True - lock the process in memory to prevent it from being swapped.
False - do not lock.
mlockall() lock ... all of the calling process's virtual address
space into RAM, preventing that memory from being paged to the
swap area. - `man mlockall`
It is disabled by default because the value mlockall = True in
Fedora 28 causes the process to increase memory consumption by
200 MiB. On Debian 8 and 9 there is no such problem.
mlockall = False
Setting negative values for self_nice and self_oom_score_adj
Setting negative values for niceness and oom_score_adj
requires root privileges.
Setting a negative self_nice raises the priority of the process.
Setting a negative niceness raises the priority of the process.
Valid values are integers from the range [-20; 19].
self_nice = -15
niceness = -15
# -> niceness
Set oom_score_adj for the process.
Set oom_score_adj for the nohang process.
Valid values are integers from the range [-1000; 1000].
Setting the value to -1000 will prohibit suicide.
self_oom_score_adj = -100
oom_score_adj = -100
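As a rough illustration of what the niceness and oom_score_adj settings imply, the sketch below validates both values against the ranges stated above and then applies them to the current process. The function names `check_range` and `apply_self_defense` are hypothetical; this is not nohang's implementation. It relies only on `os.nice` and the Linux `/proc/self/oom_score_adj` file.

```python
import os


def check_range(value, low, high, label):
    """Return value if low <= value <= high, else raise ValueError."""
    if not low <= value <= high:
        raise ValueError(
            '{} must be in [{}; {}], got {}'.format(label, low, high, value))
    return value


def apply_self_defense(niceness, oom_score_adj):
    """Apply both settings to the calling process (Linux only).

    Hypothetical helper. Negative values require root privileges.
    """
    check_range(niceness, -20, 19, 'niceness')
    check_range(oom_score_adj, -1000, 1000, 'oom_score_adj')
    # os.nice() adds its argument to the current niceness, so this sets
    # the configured value only if the process starts at niceness 0.
    os.nice(niceness)
    # The kernel reads the OOM adjustment from this per-process file.
    with open('/proc/self/oom_score_adj', 'w') as f:
        f.write(str(oom_score_adj))
```

Calling `apply_self_defense(-15, -100)` without root is expected to fail with a permission error, which matches the note above that negative values require root privileges.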
Read `man ionice` to understand the following parameters.
Setting the True value requires the root privileges.
@ -279,11 +288,10 @@ realtime_ionice_classdata = 5
#####################################################################
* STANDARD OUTPUT VERBOSITY
8. Output verbosity
Display the configuration when the program starts.
Valid values are True and False.
Values are case sensitive!
print_config = False