update translations and cosmetic changes

Alexey Avramov 2018-07-21 05:31:45 +09:00
parent 84c13e9e90
commit efee639e15
3 changed files with 282 additions and 234 deletions

@ -1,19 +1,32 @@
The No Hang Daemon # Nohang
==================
`Nohang` is a highly configurable daemon for Linux which is able to correctly prevent out of memory conditions. `Nohang` is a highly configurable daemon for Linux which is able to correctly prevent out of memory conditions and save disk cache.
### What is the problem? ## What is the problem?
OOM killer doesn't prevent OOM conditions. OOM killer doesn't prevent OOM conditions. OOM conditions may also cause loss of disk cache, [freezes](https://en.wikipedia.org/wiki/Hang_(computing)), [livelocks](https://en.wikipedia.org/wiki/Deadlock#Livelock) and the killing of multiple processes.
### Solutions "How do I prevent Linux from freezing when out of memory?
- Use of [earlyoom](https://github.com/rfjakob/earlyoom). This is a simple OOM preventer written in C. Today I (accidentally) ran some program on my Linux box that quickly used a lot of memory. My system froze, became unresponsive and thus I was unable to kill the offender.
- Use of nohang. This is an advanced OOM preventer written in Python.
### Some features How can I prevent this in the future? Can't it at least keep a responsive core or something running?"
[serverfault](https://serverfault.com/questions/390623/how-do-i-prevent-linux-from-freezing-when-out-of-memory)
"With or without swap it still freezes before the OOM killer gets run automatically. This is really a kernel bug that should be fixed (i.e. run OOM killer earlier, before dropping all disk cache). Unfortunately kernel developers and a lot of other folk fail to see the problem. Common suggestions such as disable/enable swap, buy more RAM, run less processes, set limits etc. do not address the underlying problem that the kernel's low memory handling sucks camel's balls."
[serverfault](https://serverfault.com/questions/390623/how-do-i-prevent-linux-from-freezing-when-out-of-memory)
Also look at "Why are low memory conditions handled so badly?" [r/linux](https://www.reddit.com/r/linux/comments/56r4xj/why_are_low_memory_conditions_handled_so_badly/) - discussion with 480+ posts.
## Solutions
- Use of [earlyoom](https://github.com/rfjakob/earlyoom). This is a simple and lightweight OOM preventer written in C.
- Use of [oomd](https://github.com/facebookincubator/oomd). This is a userspace OOM killer for Linux systems written in C++ and developed by Facebook.
- Use of nohang.
## Some features
- convenient configuration with a well commented config file (there are 35 parameters in the config) - convenient configuration with a well commented config file (there are 35 parameters in the config)
- `SIGKILL` and `SIGTERM` as signals that can be sent to the victim - `SIGKILL` and `SIGTERM` as signals that can be sent to the victim
@ -24,7 +37,7 @@ OOM killer doesn't prevent OOM conditions.
- possibility of restarting processes via command like `systemctl restart something` if the process is selected as a victim - possibility of restarting processes via command like `systemctl restart something` if the process is selected as a victim
- look at the [config](https://github.com/hakavlad/nohang/blob/master/nohang.conf) to find more - look at the [config](https://github.com/hakavlad/nohang/blob/master/nohang.conf) to find more
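One of the configurable behaviours, visible in the `nohang` script changed by this same commit, is adjusting a process's badness when its name matches `avoid_regex` or `prefer_regex`. The sketch below mirrors the two adjustments shown in the diff (divide by `avoid_factor`, then add 1 and multiply by `prefer_factor`). The function name `adjust_badness` and its argument layout are hypothetical and used only for illustration.

```python
from re import fullmatch


def adjust_badness(name, oom_score, avoid_regex, avoid_factor,
                   prefer_regex, prefer_factor):
    """Return the badness for a process name (hypothetical helper).

    Follows the arithmetic visible in the diff: a name matching
    avoid_regex has its score divided by avoid_factor, and a name
    matching prefer_regex has (score + 1) multiplied by prefer_factor.
    """
    badness = oom_score
    if fullmatch(avoid_regex, name) is not None:
        badness = int(badness / avoid_factor)
    if fullmatch(prefer_regex, name) is not None:
        badness = int((badness + 1) * prefer_factor)
    return badness


# Example values below are made up for illustration.
for proc in ('sshd', 'firefox', 'bash'):
    print(proc, adjust_badness(proc, 300, r'sshd|Xorg', 3, r'firefox', 2))
```

The `+ 1` before multiplying means a preferred process with an `oom_score` of 0 still ends up with a non-zero badness.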
### Demo ## Demo
[Video](https://youtu.be/DefJBaKD7C8): nohang prevents OOM after the command `while true; do tail /dev/zero; done` has been executed. [Video](https://youtu.be/DefJBaKD7C8): nohang prevents OOM after the command `while true; do tail /dev/zero; done` has been executed.
@ -58,40 +71,40 @@ MemAvail: 1535 M, 26.1 %
``` ```
And demo: https://youtu.be/5d6UovJzK8k And demo: https://youtu.be/5d6UovJzK8k
### Requirements ## Requirements
- `Linux 3.14+` (because the MemAvailable parameter appeared in /proc/meminfo since kernel version 3.14) and `Python 3.4+` (compatibility with earlier versions was not tested) for basic usage - `Linux 3.14+` (because the MemAvailable parameter appeared in /proc/meminfo since kernel version 3.14) and `Python 3.4+` (compatibility with earlier versions was not tested) for basic usage
- `libnotify` (Fedora, Arch) or `libnotify-bin` (Debian, Ubuntu) for desktop notifications and `sudo` for desktop notifications as root - `libnotify` (Fedora, Arch) or `libnotify-bin` (Debian, Ubuntu) for desktop notifications and `sudo` for desktop notifications as root
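The kernel requirement comes down to whether `/proc/meminfo` exposes a `MemAvailable` field. A minimal sketch of that check follows. It works on a text sample so it can run anywhere, and the helper name `mem_available_kib` is an assumption for illustration, not part of nohang.

```python
def mem_available_kib(meminfo_text):
    """Return MemAvailable in KiB, or None if the field is absent.

    Hypothetical helper: scans meminfo-style "Name:  value kB" lines.
    """
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(':')
        if name == 'MemAvailable':
            return int(rest.split()[0])
    return None


# Made-up sample in the /proc/meminfo layout.
sample = (
    'MemTotal:        6022148 kB\n'
    'MemFree:          512340 kB\n'
    'MemAvailable:    1571840 kB\n'
)

value = mem_available_kib(sample)
if value is None:
    print('MemAvailable is missing: Linux 3.14+ is required')
else:
    print('MemAvailable: {} KiB'.format(value))
```

On a real system the same function could be fed the contents of `/proc/meminfo`.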
### Memory and CPU usage ## Memory and CPU usage
- VmRSS is 10 — 13.5 MiB depending on the settings - VmRSS is 10 — 13.5 MiB depending on the settings
- CPU usage depends on the level of available memory (the frequency of memory status checks increases as the amount of available memory decreases) and monitoring intensity (can be changed by user via config) - CPU usage depends on the level of available memory (the frequency of memory status checks increases as the amount of available memory decreases) and monitoring intensity (can be changed by user via config)
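The relation between available memory and check frequency can be sketched as follows. The exact formula nohang uses is not shown in this diff, so this assumes the simplest form: the sleep period is the available memory divided by an assumed maximum fill rate, clamped to a range. The name `sleep_period` and the default bounds are hypothetical.

```python
def sleep_period(mem_available_kib, rate_kib_per_s, min_s=0.01, max_s=3.0):
    """Return seconds to sleep before the next memory check.

    Assumption: memory cannot be consumed faster than rate_kib_per_s,
    so it is safe to sleep for available / rate seconds. Less available
    memory therefore means shorter sleeps and more frequent checks.
    """
    if rate_kib_per_s <= 0:
        raise ValueError('rate_kib_per_s must be positive')
    period = mem_available_kib / rate_kib_per_s
    return max(min_s, min(period, max_s))


# Made-up figures: the period shrinks as available memory shrinks.
for available in (12000000, 6000000, 600000, 100):
    print(available, sleep_period(available, 6000000))
```

A higher assumed rate gives shorter periods at every memory level, which trades CPU time for faster reaction.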
### Status ## Status
The program is unstable and some fixes are required before the first stable version is released (it needs documentation, translation, review and some optimisation). The program is unstable and some fixes are required before the first stable version is released (it needs documentation, translation, review and some optimisation).
### Download ## Download
```bash ```bash
git clone https://github.com/hakavlad/nohang.git git clone https://github.com/hakavlad/nohang.git
cd nohang cd nohang
``` ```
### Installation and start for systemd users ## Installation and start for systemd users
```bash ```bash
sudo ./install.sh sudo ./install.sh
``` ```
### Purge ## Purge
```bash ```bash
sudo ./purge.sh sudo ./purge.sh
``` ```
### Command line options ## Command line options
``` ```
./nohang -h ./nohang -h
@ -104,7 +117,7 @@ optional arguments:
./nohang.conf, /etc/nohang/nohang.conf ./nohang.conf, /etc/nohang/nohang.conf
``` ```
### How to configure nohang ## How to configure nohang
The program can be configured by editing the [config file](https://github.com/hakavlad/nohang/blob/master/nohang.conf). The configuration includes the following sections: The program can be configured by editing the [config file](https://github.com/hakavlad/nohang/blob/master/nohang.conf). The configuration includes the following sections:
@ -119,7 +132,7 @@ The program can be configured by editing the [config file](https://github.com/ha
Just read the description of the parameters and edit the values. Please restart nohang to apply changes. The default path to the config after installing via `./install.sh` is `/etc/nohang/nohang.conf`. Just read the description of the parameters and edit the values. Please restart nohang to apply changes. The default path to the config after installing via `./install.sh` is `/etc/nohang/nohang.conf`.
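As a rough illustration of how such a config can be read and validated, the sketch below assumes a `name = value` line format with `#` comments; the real file format is not shown in this diff. `parse_config` and `parse_bool` are hypothetical names. The strict `True`/`False` check follows the validation messages visible in the script.

```python
def parse_config(text):
    """Parse 'name = value' lines into a dict (assumed format)."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith('#'):
            continue
        key, sep, value = line.partition('=')
        if not sep:
            continue  # not a 'name = value' line
        config[key.strip()] = value.strip()
    return config


def parse_bool(config, name):
    """Return a bool for a parameter that must be exactly True or False."""
    if name not in config:
        raise KeyError('There is no "{}" parameter in the config'.format(name))
    value = config[name]
    if value == 'True':
        return True
    if value == 'False':
        return False
    raise ValueError('Invalid value of the "{}" parameter'.format(name))


# Made-up config fragment using parameter names that appear in the diff.
example = '# print the config at startup\nprint_config = True\nmin_badness = 20\n'
settings = parse_config(example)
print(settings)
print(parse_bool(settings, 'print_config'))
```

Raising exceptions instead of calling `exit()` is a choice made for this sketch so the functions are easy to test; the script in the diff prints a message and exits.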
### Feedback ## Feedback
Please create [issues](https://github.com/hakavlad/nohang/issues). Use cases, feature requests and any questions are welcome. Please create [issues](https://github.com/hakavlad/nohang/issues). Use cases, feature requests and any questions are welcome.

293
nohang
@ -6,22 +6,20 @@ import os
from operator import itemgetter from operator import itemgetter
from time import sleep, time from time import sleep, time
from argparse import ArgumentParser from argparse import ArgumentParser
from subprocess import Popen # from subprocess import Popen
sig_dict = {9: 'SIGKILL', 15: 'SIGTERM'} sig_dict = {9: 'SIGKILL', 15: 'SIGTERM'}
# директория, в которой запущен скрипт # directory where the script is running
cd = os.getcwd() cd = os.getcwd()
# где искать конфиг, если не указан через опцию -c/--config # where to look for a config if not specified via the -c/--config option
default_configs = ( default_configs = (cd + '/nohang.conf', '/etc/nohang/nohang.conf')
cd + '/nohang.conf',
'/etc/nohang/nohang.conf'
)
# universal message if config is invalid # universal message if config is invalid
conf_err_mess = '\nSet up the path to the valid config file with -c/--confi' \ conf_err_mess = '\nSet up the path to the valid conf' \
'g option!\nExit' 'ig file with -c/--config option!\nExit'
# means that setting zram disksize = 10000M reduces the available memory # means that setting zram disksize = 10000M reduces the available memory
# by 42M # by 42M
@ -31,6 +29,9 @@ conf_err_mess = '\nSet up the path to the valid config file with -c/--confi' \
# ("zram uses about 0.1% of the size of the disk" # ("zram uses about 0.1% of the size of the disk"
# - https://www.kernel.org/doc/Documentation/blockdev/zram.txt), # - https://www.kernel.org/doc/Documentation/blockdev/zram.txt),
# but this statement contradicts experimental data # but this statement contradicts experimental data
# zram_disksize_factor = deltaMemAvailable / disksize
# found experimentally
zram_disksize_factor = 0.0042 zram_disksize_factor = 0.0042
name_strip_string = '\'"`\\!-$' name_strip_string = '\'"`\\!-$'
@ -54,16 +55,17 @@ def string_to_int_convert_test(string):
return None return None
# извлечение праметра из словаря конфига, возврат str # extracting the parameter from the config dictionary, str return
def conf_parse_string(param): def conf_parse_string(param):
if param in config_dict: if param in config_dict:
return config_dict[param].strip() return config_dict[param].strip()
else: else:
print('{} not in config\nExit'.format(param)) print('All the necessary parameters must be in the config')
print('There is no "{}" parameter in the config'.format(param))
exit() exit()
# извлечение праметра из словаря конфига, возврат bool # extracting the parameter from the config dictionary, bool return
def conf_parse_bool(param): def conf_parse_bool(param):
if param in config_dict: if param in config_dict:
param_str = config_dict[param] param_str = config_dict[param]
@ -72,22 +74,20 @@ def conf_parse_bool(param):
elif param_str == 'False': elif param_str == 'False':
return False return False
else: else:
print('Invalid {} value {} (shou' \ print('Invalid value of the "{}" parameter.'.format(param))
'ld be True or False)\nExit'.format(param, param_str)) print('Valid values are True and False.')
print('Exit')
exit() exit()
else: else:
print('{} not in config\nExit'.format(param)) print('All the necessary parameters must be in the config')
print('There is no "{}" parameter in the config'.format(param))
exit() exit()
def func_decrease_oom_score_adj(oom_score_adj_max): def func_decrease_oom_score_adj(oom_score_adj_max):
# цикл для наполнения oom_list
for i in os.listdir('/proc'): for i in os.listdir('/proc'):
# пропускаем элементы, не состоящие только из цифр
if i.isdigit() is not True: if i.isdigit() is not True:
continue continue
try: try:
oom_score_adj = int(rline1('/proc/' + i + '/oom_score_adj')) oom_score_adj = int(rline1('/proc/' + i + '/oom_score_adj'))
if oom_score_adj > oom_score_adj_max: if oom_score_adj > oom_score_adj_max:
@ -99,14 +99,14 @@ def func_decrease_oom_score_adj(oom_score_adj_max):
pass pass
# чтение первой строки файла # read 1st line
def rline1(path): def rline1(path):
with open(path) as f: with open(path) as f:
for line in f: for line in f:
return line[:-1] return line[:-1]
# запись в файл # write in file
def write(path, string): def write(path, string):
with open(path, 'w') as f: with open(path, 'w') as f:
f.write(string) f.write(string)
@ -128,12 +128,12 @@ def just_percent_swap(num):
return str(round(num * 100, 1)).rjust(5, ' ') return str(round(num * 100, 1)).rjust(5, ' ')
# K -> M, выравнивание по правому краю # KiB to MiB, right alignment
def human(num, lenth): def human(num, lenth):
return str(round(num / 1024)).rjust(lenth, ' ') return str(round(num / 1024)).rjust(lenth, ' ')
# возвращает disksize и mem_used_total по zram id # return str with amount of bytes
def zram_stat(zram_id): def zram_stat(zram_id):
try: try:
disksize = rline1('/sys/block/' + zram_id + '/disksize') disksize = rline1('/sys/block/' + zram_id + '/disksize')
@ -153,7 +153,7 @@ def zram_stat(zram_id):
return disksize, mem_used_total # BYTES, str return disksize, mem_used_total # BYTES, str
# имя через пид # return process name
def pid_to_name(pid): def pid_to_name(pid):
try: try:
with open('/proc/' + pid + '/status') as f: with open('/proc/' + pid + '/status') as f:
@ -166,7 +166,7 @@ def pid_to_name(pid):
def send_notify_warn(): def send_notify_warn():
# текст отправляемого уведомления
if mem_used_zram > 0: if mem_used_zram > 0:
info = '"<i>MemAvailable:</i> <b>{} MiB</b>\n<i>SwapFree:</i> <b>{} MiB</b>\n<i>MemUsedZram:</i> <b>{} MiB</b>" &'.format( info = '"<i>MemAvailable:</i> <b>{} MiB</b>\n<i>SwapFree:</i> <b>{} MiB</b>\n<i>MemUsedZram:</i> <b>{} MiB</b>" &'.format(
kib_to_mib(mem_available), kib_to_mib(mem_available),
@ -229,8 +229,6 @@ def sleep_after_send_signal(signal):
def find_victim_and_send_signal(signal): def find_victim_and_send_signal(signal):
time0 = time()
print(mem_info) print(mem_info)
# set a ceiling for the oom_score_adj of all processes # set a ceiling for the oom_score_adj of all processes
@ -240,7 +238,7 @@ def find_victim_and_send_signal(signal):
# get the list of processes ((pid, badness)) # get the list of processes ((pid, badness))
oom_list = [] oom_list = []
if use_regex_lists: if regex_matching:
for pid in os.listdir('/proc'): for pid in os.listdir('/proc'):
if pid.isdigit() is not True: if pid.isdigit() is not True:
@ -250,16 +248,16 @@ def find_victim_and_send_signal(signal):
oom_score = int(rline1('/proc/' + pid + '/oom_score')) oom_score = int(rline1('/proc/' + pid + '/oom_score'))
name = pid_to_name(pid) name = pid_to_name(pid)
res = fullmatch(avoidlist_regex, name) res = fullmatch(avoid_regex, name)
if res is not None: if res is not None:
# the badness is computed here # the badness is computed here
oom_score = int(oom_score / avoidlist_factor) oom_score = int(oom_score / avoid_factor)
print(' {} (Pid: {}, Badness {}) matches with avoidlist_regex'.format(name, pid, oom_score)), print(' {} (Pid: {}, Badness {}) matches with avoid_regex'.format(name, pid, oom_score)),
res = fullmatch(preferlist_regex, name) res = fullmatch(prefer_regex, name)
if res is not None: if res is not None:
oom_score = int((oom_score + 1) * preferlist_factor) oom_score = int((oom_score + 1) * prefer_factor)
print(' {} (Pid: {}, Badness {}) matches with preferlist_regex'.format(name, pid, oom_score)), print(' {} (Pid: {}, Badness {}) matches with prefer_regex'.format(name, pid, oom_score)),
except FileNotFoundError: except FileNotFoundError:
oom_score = 0 oom_score = 0
@ -287,7 +285,7 @@ def find_victim_and_send_signal(signal):
# get the maximum oom_score # get the maximum oom_score
oom_score = pid_tuple_list[1] oom_score = pid_tuple_list[1]
if oom_score >= oom_score_min: if oom_score >= min_badness:
# try to send the signal to the selected victim # try to send the signal to the selected victim
@ -326,19 +324,24 @@ def find_victim_and_send_signal(signal):
else: else:
try: try: # SUCCESS -> RESPONSE TIME
os.kill(int(pid), signal) os.kill(int(pid), signal)
success_time = time() success_time = time()
delta_success = success_time - time0 delta_success = success_time - time0
send_result = ' Success; reaction time: {} ms'.format(round(delta_success * 1000)) send_result = ' Success; response time: {} ms\n'.format(round(delta_success * 1000)) + r'}'
if desktop_notifications: if gui_notifications:
send_notify(signal, name, pid, oom_score, vm_rss, vm_swap) send_notify(signal, name, pid, oom_score, vm_rss, vm_swap)
except FileNotFoundError: except FileNotFoundError:
send_result = ' No such process' success_time = time()
delta_success = success_time - time0
send_result = ' No such process; response time: {} ms'.format(round(delta_success * 1000))
except ProcessLookupError: except ProcessLookupError:
send_result = ' No such process' success_time = time()
delta_success = success_time - time0
send_result = ' No such process; response time: {} ms'.format(round(delta_success * 1000))
try_to_send = ' Preventing OOM: trying to send the {} signal to {},\n Pid: {}, Badness: {}, VmRSS: {} MiB, VmSwap: {} MiB'.format(sig_dict[signal], name, pid, oom_score, vm_rss, vm_swap) try_to_send = ' Preventing OOM: trying to send the {} signal to {},\n Pid: {}, Badness: {}, VmRSS: {} MiB, VmSwap: {} MiB'.format(sig_dict[signal], name, pid, oom_score, vm_rss, vm_swap)
@ -347,8 +350,12 @@ def find_victim_and_send_signal(signal):
else: else:
badness_is_too_small = ' oom_score {} < oom_score_min {}'.format( success_time = time()
oom_score, oom_score_min) delta_success = success_time - time0
badness_is_too_small = ' oom_score {} < min_badness {}; response time: {} ms'.format(
oom_score, min_badness, round(delta_success * 1000))
print(badness_is_too_small) print(badness_is_too_small)
@ -391,7 +398,7 @@ for s in mem_list:
mem_list_names.append(s.split(':')[0]) mem_list_names.append(s.split(':')[0])
if mem_list_names[2] != 'MemAvailable': if mem_list_names[2] != 'MemAvailable':
print('Your Linux kernel is too old, 3.14+ is required\nExit') print('Your Linux kernel is too old, Linux 3.14+ is required\nExit')
exit() exit()
swap_total_index = mem_list_names.index('SwapTotal') swap_total_index = mem_list_names.index('SwapTotal')
@ -455,7 +462,7 @@ print(config)
########################################################################## ##########################################################################
# парсинг конфига с получением словаря параметров # parsing the config with obtaining the parameters dictionary
# conf_parameters_dict # conf_parameters_dict
# conf_restart_dict # conf_restart_dict
@ -463,10 +470,10 @@ print(config)
try: try:
with open(config) as f: with open(config) as f:
# словарь с параметрами конфига # dictionary with config options
config_dict = dict() config_dict = dict()
# словарь с именами и командами для параметра execute_the_command # dictionary with names and commands for the parameter execute_the_command
etc_dict = dict() etc_dict = dict()
for line in f: for line in f:
@ -487,7 +494,7 @@ try:
etc_name = a[0].strip() etc_name = a[0].strip()
etc_command = a[1].strip() etc_command = a[1].strip()
if len(etc_name) > 15: if len(etc_name) > 15:
print('инвалид конфиг, длина имени процесса не должна превышать 15 символов\nExit') print('Invalid config, the length of the process name must not exceed 15 characters\nExit')
exit() exit()
etc_dict[etc_name] = etc_command etc_dict[etc_name] = etc_command
@ -506,9 +513,9 @@ except IndexError:
########################################################################## ##########################################################################
# извлечение параметров из словаря # extracting parameters from the dictionary
# проверка наличия всех необходимых параметров # check for all necessary parameters
# валидация всех параметров # validation of all parameters
print_config = conf_parse_bool('print_config') print_config = conf_parse_bool('print_config')
@ -523,47 +530,59 @@ print_sleep_periods = conf_parse_bool('print_sleep_periods')
realtime_ionice = conf_parse_bool('realtime_ionice') realtime_ionice = conf_parse_bool('realtime_ionice')
if 'realtime_ionice_classdata' in config_dict: if 'realtime_ionice_classdata' in config_dict:
realtime_ionice_classdata = string_to_int_convert_test( realtime_ionice_classdata = string_to_int_convert_test(
config_dict['realtime_ionice_classdata']) config_dict['realtime_ionice_classdata'])
if realtime_ionice_classdata is None: if realtime_ionice_classdata is None:
print('Invalid realtime_ionice_classdata value, not integer\nExit') print('Invalid value of the "realtime_ionice_classdata" parameter.')
print('Valid values are integers from the range [0; 7].')
print('Exit')
exit() exit()
if realtime_ionice_classdata < 0 or realtime_ionice_classdata > 7: if realtime_ionice_classdata < 0 or realtime_ionice_classdata > 7:
print('Invalid realtime_ionice_classdata value\nExit') print('Invalid value of the "realtime_ionice_classdata" parameter.')
print('Valid values are integers from the range [0; 7].')
print('Exit')
exit() exit()
else: else:
print('realtime_ionice_classdata not in config\nExit') print('All the necessary parameters must be in the config')
print('There is no "realtime_ionice_classdata" parameter in the config')
exit() exit()
mlockall = conf_parse_bool('mlockall') mlockall = conf_parse_bool('mlockall')
if 'self_nice' in config_dict: if 'niceness' in config_dict:
self_nice = string_to_int_convert_test(config_dict['self_nice']) niceness = string_to_int_convert_test(config_dict['niceness'])
if self_nice is None: if niceness is None:
print('Invalid self_nice value, not integer\nExit') print('Invalid niceness value, not integer\nExit')
exit() exit()
if self_nice < -20 or self_nice > 19: if niceness < -20 or niceness > 19:
print('Invalid self_nice value\nExit') print('Invalid niceness value\nExit')
exit() exit()
else: else:
print('self_nice not in config\nExit') print('niceness not in config\nExit')
exit() exit()
if 'self_oom_score_adj' in config_dict: if 'oom_score_adj' in config_dict:
self_oom_score_adj = string_to_int_convert_test( oom_score_adj = string_to_int_convert_test(
config_dict['self_oom_score_adj']) config_dict['oom_score_adj'])
if self_oom_score_adj is None: if oom_score_adj is None:
print('Invalid self_oom_score_adj value, not integer\nExit') print('Invalid oom_score_adj value, not integer\nExit')
exit() exit()
if self_oom_score_adj < -1000 or self_oom_score_adj > 1000: if oom_score_adj < -1000 or oom_score_adj > 1000:
print('Invalid self_oom_score_adj value\nExit') print('Invalid oom_score_adj value\nExit')
exit() exit()
else: else:
print('self_oom_score_adj not in config\nExit') print('oom_score_adj not in config\nExit')
exit() exit()
@ -813,17 +832,17 @@ else:
exit() exit()
if 'oom_score_min' in config_dict: if 'min_badness' in config_dict:
oom_score_min = string_to_int_convert_test( min_badness = string_to_int_convert_test(
config_dict['oom_score_min']) config_dict['min_badness'])
if oom_score_min is None: if min_badness is None:
print('Invalid oom_score_min value, not integer\nExit') print('Invalid min_badness value, not integer\nExit')
exit() exit()
if oom_score_min < 0 or oom_score_min > 1000: if min_badness < 0 or min_badness > 1000:
print('Invalid oom_score_min value\nExit') print('Invalid min_badness value\nExit')
exit() exit()
else: else:
print('oom_score_min not in config\nExit') print('min_badness not in config\nExit')
exit() exit()
@ -844,10 +863,10 @@ else:
exit() exit()
if 'desktop_notifications' in config_dict: if 'gui_notifications' in config_dict:
desktop_notifications = config_dict['desktop_notifications'] gui_notifications = config_dict['gui_notifications']
if desktop_notifications == 'True': if gui_notifications == 'True':
desktop_notifications = True gui_notifications = True
users_dict = dict() users_dict = dict()
with open('/etc/passwd') as f: with open('/etc/passwd') as f:
for line in f: for line in f:
@ -855,15 +874,15 @@ if 'desktop_notifications' in config_dict:
username = line_list[0] username = line_list[0]
uid = line_list[2] uid = line_list[2]
users_dict[uid] = username users_dict[uid] = username
elif desktop_notifications == 'False': elif gui_notifications == 'False':
desktop_notifications = False gui_notifications = False
else: else:
print('Invalid desktop_notifications value {} (shoul' \ print('Invalid gui_notifications value {} (shoul' \
'd be True or False)\nExit'.format( 'd be True or False)\nExit'.format(
desktop_notifications)) gui_notifications))
exit() exit()
else: else:
print('desktop_notifications not in config\nExit') print('gui_notifications not in config\nExit')
exit() exit()
@ -873,47 +892,47 @@ notify_options = conf_parse_string('notify_options')
root_display = conf_parse_string('root_display') root_display = conf_parse_string('root_display')
use_regex_lists = conf_parse_bool('use_regex_lists') regex_matching = conf_parse_bool('regex_matching')
if use_regex_lists: if regex_matching:
from re import fullmatch from re import fullmatch
preferlist_regex = conf_parse_string('preferlist_regex') prefer_regex = conf_parse_string('prefer_regex')
if 'preferlist_factor' in config_dict: if 'prefer_factor' in config_dict:
preferlist_factor = string_to_float_convert_test(config_dict['preferlist_factor']) prefer_factor = string_to_float_convert_test(config_dict['prefer_factor'])
if preferlist_factor is None: if prefer_factor is None:
print('Invalid preferlist_factor value, not float\nExit') print('Invalid prefer_factor value, not float\nExit')
exit() exit()
if preferlist_factor < 1 or preferlist_factor > 1000: if prefer_factor < 1 or prefer_factor > 1000:
print('preferlist_factor must be in the range [1; 1000]\nExit') print('prefer_factor must be in the range [1; 1000]\nExit')
exit() exit()
else: else:
print('preferlist_factor not in config\nExit') print('prefer_factor not in config\nExit')
exit() exit()
avoidlist_regex = conf_parse_string('avoidlist_regex') avoid_regex = conf_parse_string('avoid_regex')
if 'avoidlist_factor' in config_dict: if 'avoid_factor' in config_dict:
avoidlist_factor = string_to_float_convert_test(config_dict['avoidlist_factor']) avoid_factor = string_to_float_convert_test(config_dict['avoid_factor'])
if avoidlist_factor is None: if avoid_factor is None:
print('Invalid avoidlist_factor value, not float\nExit') print('Invalid avoid_factor value, not float\nExit')
exit() exit()
if avoidlist_factor < 1 or avoidlist_factor > 1000: if avoid_factor < 1 or avoid_factor > 1000:
print('avoidlist_factor must be in the range [1; 1000]\nExit') print('avoid_factor must be in the range [1; 1000]\nExit')
exit() exit()
else: else:
print('avoidlist_factor not in config\nExit') print('avoid_factor not in config\nExit')
exit() exit()
low_memory_warnings = conf_parse_bool('low_memory_warnings') gui_low_memory_warnings = conf_parse_bool('gui_low_memory_warnings')
if 'min_time_between_warnings' in config_dict: if 'min_time_between_warnings' in config_dict:
@ -1077,22 +1096,22 @@ else:
# raise the priority # raise the priority
try: try:
os.nice(self_nice) os.nice(niceness)
self_nice_result = 'OK' niceness_result = 'OK'
except PermissionError: except PermissionError:
self_nice_result = 'Fail' niceness_result = 'Fail'
pass pass
# option to protect the daemon from being killed itself # option to protect the daemon from being killed itself
try: try:
with open('/proc/self/oom_score_adj', 'w') as file: with open('/proc/self/oom_score_adj', 'w') as file:
file.write('{}\n'.format(self_oom_score_adj)) file.write('{}\n'.format(oom_score_adj))
self_oom_score_adj_result = 'OK' oom_score_adj_result = 'OK'
except PermissionError: except PermissionError:
pass pass
self_oom_score_adj_result = 'Fail' oom_score_adj_result = 'Fail'
except OSError: except OSError:
self_oom_score_adj_result = 'Fail' oom_score_adj_result = 'Fail'
pass pass
# prevent the process from being swapped out # prevent the process from being swapped out
@ -1111,6 +1130,10 @@ self_uid = os.geteuid()
self_pid = os.getpid() self_pid = os.getpid()
if self_uid == 0: if self_uid == 0:
root = True root = True
decrease_res = 'OK' decrease_res = 'OK'
@ -1140,11 +1163,11 @@ if print_config:
print('\nII. SELF-DEFENSE [displaying these options need fix]') print('\nII. SELF-DEFENSE [displaying these options need fix]')
print('mlockall: {} ({})'.format(mlockall, mla_res)) print('mlockall: {} ({})'.format(mlockall, mla_res))
print('self_nice: {} ({})'.format( print('niceness: {} ({})'.format(
self_nice, self_nice_result niceness, niceness_result
)) ))
print('self_oom_score_adj: {} ({})'.format( print('oom_score_adj: {} ({})'.format(
self_oom_score_adj, self_oom_score_adj_result oom_score_adj, oom_score_adj_result
)) ))
print('\nIII. INTENSITY OF MONITORING') print('\nIII. INTENSITY OF MONITORING')
@ -1170,7 +1193,7 @@ if print_config:
print('\nV. PREVENTION OF KILLING INNOCENT VICTIMS') print('\nV. PREVENTION OF KILLING INNOCENT VICTIMS')
print('min_delay_after_sigterm: {}'.format(min_delay_after_sigterm)) print('min_delay_after_sigterm: {}'.format(min_delay_after_sigterm))
print('min_delay_after_sigkill: {}'.format(min_delay_after_sigkill)) print('min_delay_after_sigkill: {}'.format(min_delay_after_sigkill))
print('oom_score_min: {}'.format(oom_score_min)) print('min_badness: {}'.format(min_badness))
# False (OK) - the OK is not needed when the value is False # False (OK) - the OK is not needed when the value is False
print('decrease_oom_score_adj: {} ({})'.format( print('decrease_oom_score_adj: {} ({})'.format(
@ -1180,22 +1203,22 @@ if print_config:
print('oom_score_adj_max: {}'.format(oom_score_adj_max)) print('oom_score_adj_max: {}'.format(oom_score_adj_max))
print('\nVI. DESKTOP NOTIFICATIONS') print('\nVI. DESKTOP NOTIFICATIONS')
print('desktop_notifications: {}'.format(desktop_notifications)) print('gui_notifications: {}'.format(gui_notifications))
if desktop_notifications: if gui_notifications:
print('notify_options: {}'.format(notify_options)) print('notify_options: {}'.format(notify_options))
print('root_display: {}'.format(root_display)) print('root_display: {}'.format(root_display))
print('\nVII. AVOID AND PREFER VICTIM NAMES VIA REGEX') print('\nVII. AVOID AND PREFER VICTIM NAMES VIA REGEX')
print('use_regex_lists: {}'.format(use_regex_lists)) print('regex_matching: {}'.format(regex_matching))
if use_regex_lists: if regex_matching:
print('preferlist_regex: {}'.format(preferlist_regex)) print('prefer_regex: {}'.format(prefer_regex))
print('preferlist_factor: {}'.format(preferlist_factor)) print('prefer_factor: {}'.format(prefer_factor))
print('avoidlist_regex: {}'.format(avoidlist_regex)) print('avoid_regex: {}'.format(avoid_regex))
print('avoidlist_factor: {}'.format(avoidlist_factor)) print('avoid_factor: {}'.format(avoid_factor))
print('\nIX. LOW MEMORY WARNINGS') print('\nIX. LOW MEMORY WARNINGS')
print('low_memory_warnings: {}'.format(low_memory_warnings)) print('gui_low_memory_warnings: {}'.format(gui_low_memory_warnings))
if low_memory_warnings: if gui_low_memory_warnings:
print('min_time_between_warnings: {}'.format(min_time_between_warnings)) print('min_time_between_warnings: {}'.format(min_time_between_warnings))
print('mem_min_warnings: {} MiB, {} %'.format( print('mem_min_warnings: {} MiB, {} %'.format(
@ -1218,7 +1241,7 @@ if print_config:
########################################################################## ##########################################################################
# для рассчета ширины столбцов при печати mem и zram # for calculating the column width when printing mem and zram
mem_len = len(str(round(mem_total / 1024.0))) mem_len = len(str(round(mem_total / 1024.0)))
rate_mem = rate_mem * 1048576 rate_mem = rate_mem * 1048576
@ -1233,11 +1256,11 @@ print('\nStart monitoring...')
########################################################################## ##########################################################################
# цикл проверки уровней доступной памяти
while True: while True:
# находим mem_available, swap_total, swap_free # find mem_available, swap_total, swap_free
with open('/proc/meminfo') as f: with open('/proc/meminfo') as f:
for n, line in enumerate(f): for n, line in enumerate(f):
if n == 2: if n == 2:
@ -1252,7 +1275,7 @@ while True:
# если swap_min_sigkill задан в процентах # if swap_min_sigkill is set in percent
if swap_kill_is_percent: if swap_kill_is_percent:
swap_min_sigkill_kb = swap_total * swap_min_sigkill_percent / 100.0 swap_min_sigkill_kb = swap_total * swap_min_sigkill_percent / 100.0
@ -1263,7 +1286,7 @@ while True:
swap_min_warnings_kb = swap_total * swap_min_warnings_percent / 100.0 swap_min_warnings_kb = swap_total * swap_min_warnings_percent / 100.0
# находим MemUsedZram # find MemUsedZram
disksize_sum = 0 disksize_sum = 0
mem_used_total_sum = 0 mem_used_total_sum = 0
for dev in os.listdir('/sys/block'): for dev in os.listdir('/sys/block'):
@ -1325,6 +1348,7 @@ while True:
# MEM SWAP KILL # MEM SWAP KILL
if mem_available <= mem_min_sigkill_kb and swap_free <= swap_min_sigkill_kb: if mem_available <= mem_min_sigkill_kb and swap_free <= swap_min_sigkill_kb:
time0 = time()
mem_info = '* MemAvailable ({} MiB, {} %) < mem_min_sigkill ({} MiB, {} %)\n Swa' \ mem_info = '* MemAvailable ({} MiB, {} %) < mem_min_sigkill ({} MiB, {} %)\n Swa' \
'pFree ({} MiB, {} %) < swap_min_sigkill ({} MiB, {} %)'.format( 'pFree ({} MiB, {} %) < swap_min_sigkill ({} MiB, {} %)'.format(
@ -1344,6 +1368,7 @@ while True:
# ZRAM KILL # ZRAM KILL
elif mem_used_zram >= zram_max_sigkill_kb: elif mem_used_zram >= zram_max_sigkill_kb:
time0 = time()
mem_info = '* MemUsedZram ({} MiB, {} %) > zram_max_sigkill ({} MiB, {} %)'.format( mem_info = '* MemUsedZram ({} MiB, {} %) > zram_max_sigkill ({} MiB, {} %)'.format(
kib_to_mib(mem_used_zram), kib_to_mib(mem_used_zram),
@ -1355,8 +1380,9 @@ while True:
# MEM SWAP TERM # MEM SWAP TERM
elif mem_available <= mem_min_sigterm_kb and swap_free <= swap_min_sigterm_kb: elif mem_available <= mem_min_sigterm_kb and swap_free <= swap_min_sigterm_kb:
time0 = time()
mem_info = '* MemAvailable ({} MiB, {} %) < mem_min_sigterm ({} MiB, {} %)\n Sw' \ mem_info = r'{' + '\n MemAvailable ({} MiB, {} %) < mem_min_sigterm ({} MiB, {} %)\n Sw' \
'apFree ({} MiB, {} %) < swap_min_sigterm ({} MiB, {} %)'.format( 'apFree ({} MiB, {} %) < swap_min_sigterm ({} MiB, {} %)'.format(
kib_to_mib(mem_available), kib_to_mib(mem_available),
percent(mem_available / mem_total), percent(mem_available / mem_total),
@ -1379,6 +1405,7 @@ while True:
# ZRAM TERM # ZRAM TERM
elif mem_used_zram >= zram_max_sigterm_kb: elif mem_used_zram >= zram_max_sigterm_kb:
time0 = time()
mem_info = '* MemUsedZram ({} MiB, {} %) > zram_max_sigter' \ mem_info = '* MemUsedZram ({} MiB, {} %) > zram_max_sigter' \
'm ({} MiB, {} %)'.format( 'm ({} MiB, {} %)'.format(
@ -1390,7 +1417,7 @@ while True:
find_victim_and_send_signal(15) find_victim_and_send_signal(15)
# LOW MEMORY WARNINGS # LOW MEMORY WARNINGS
elif low_memory_warnings and desktop_notifications: elif gui_low_memory_warnings and gui_notifications:
if mem_available < mem_min_warnings_kb and swap_free < swap_min_warnings_kb + 0.1 or mem_used_zram > zram_max_warnings_kb: if mem_available < mem_min_warnings_kb and swap_free < swap_min_warnings_kb + 0.1 or mem_used_zram > zram_max_warnings_kb:
warn_time_delta = time() - warn_time_now warn_time_delta = time() - warn_time_now
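The order of the checks in the chain above (SIGKILL conditions first, then SIGTERM) can be summarised in a small pure function. This is a sketch for illustration only: the function `choose_action` and the `limits` dictionary are hypothetical, and the low-memory warning branch is left out.

```python
def choose_action(mem_available, swap_free, mem_used_zram, limits):
    """Return 'SIGKILL', 'SIGTERM' or None for the given levels (all in KiB).

    `limits` is a hypothetical dict holding the *_kb thresholds that the
    script keeps in module-level variables.
    """
    if (mem_available <= limits['mem_min_sigkill_kb']
            and swap_free <= limits['swap_min_sigkill_kb']):
        return 'SIGKILL'
    if mem_used_zram >= limits['zram_max_sigkill_kb']:
        return 'SIGKILL'
    if (mem_available <= limits['mem_min_sigterm_kb']
            and swap_free <= limits['swap_min_sigterm_kb']):
        return 'SIGTERM'
    if mem_used_zram >= limits['zram_max_sigterm_kb']:
        return 'SIGTERM'
    return None
```

Because the SIGKILL thresholds are tested first, a state that satisfies both the SIGKILL and the SIGTERM conditions results in SIGKILL.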


@ -5,31 +5,40 @@
The configuration includes the following sections: The configuration includes the following sections:
* THRESHOLDS FOR SENDING SIGNALS TO VICTIMS 1. Memory levels to respond to as an OOM threat
* INTENSITY OF MONITORING (AND CPU USAGE) 2. The frequency of checking the level of available memory
* PREVENTION OF KILLING INNOCENT VICTIMS (and CPU usage)
* AVOID AND PREFER VICTIM NAMES VIA REGEX MATCHING 3. The prevention of killing innocent victims
* EXECUTE THE COMMAND INSTEAD OF SENDING THE SIGTERM SIGNAL 4. Impact on the badness of processes via matching their names
* GUI NOTIFICATIONS: RESULTS OF PREVENTING OOM AND LOW MEMORY WARNINGS with regular expressions
* SELF-DEFENSE AND PREVENTING SLOWING DOWN THE PROGRAM 5. The execution of a specific command instead of sending the
* OUTPUT VERBOSITY SIGTERM signal
6. GUI notifications:
- results of preventing OOM
- low memory warnings
7. Preventing the slowing down of the program
8. Output verbosity
Just read the description of the parameters and edit the values. Just read the description of the parameters and edit the values.
Please restart the program after editing the config. Please restart the program after editing the config.
##################################################################### #####################################################################
* THRESHOLDS FOR SENDING SIGNALS TO VICTIMS 1. Thresholds below which a signal should be sent to the victim
Sets the available memory levels below which SIGTERM or SIGKILL Sets the available memory levels below which SIGTERM or SIGKILL
signals are sent. A signal is sent when MemAvailable and signals are sent. A signal is sent when MemAvailable and
SwapFree both drop below the corresponding SwapFree (in /proc/meminfo) both drop below the
values. Can be specified in % (percent) or M (MiB). Valid values corresponding values. Can be specified in % (percent) or M (MiB).
are floating-point numbers from the range [0; 100] %. Valid values are floating-point numbers from the range [0; 100] %.
MemAvailable levels.
mem_min_sigterm = 9 % mem_min_sigterm = 9 %
mem_min_sigkill = 6 % mem_min_sigkill = 6 %
SwapFree levels.
swap_min_sigterm = 9 % swap_min_sigterm = 9 %
swap_min_sigkill = 6 % swap_min_sigkill = 6 %
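As an illustration of how a threshold written as `9 %` or `500 M` could be turned into KiB, here is a minimal sketch. The function name `threshold_to_kib` and its error handling are assumptions; the parser actually used by nohang is not shown in this diff.

```python
def threshold_to_kib(value, total_kib):
    """Convert a threshold such as '9 %' or '500 M' to KiB.

    Hypothetical helper: percentages are taken relative to total_kib
    (for example MemTotal or SwapTotal), and M means MiB.
    """
    text = value.strip()
    if text.endswith('%'):
        percent = float(text[:-1])
        if not 0 <= percent <= 100:
            raise ValueError('percentage out of range [0; 100]: ' + value)
        return total_kib * percent / 100.0
    if text.endswith('M'):
        return float(text[:-1]) * 1024
    raise ValueError('expected a value ending in % or M: ' + value)
```

The percentage branch mirrors the expression `swap_total * swap_min_sigkill_percent / 100.0` used in the monitoring loop.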
@ -41,30 +50,26 @@ swap_min_sigkill = 6 %
Can be specified in % and M. Valid values are floating-point Can be specified in % and M. Valid values are floating-point
numbers from the range [0; 100] %. numbers from the range [0; 100] %.
zram_max_sigterm = 55 % zram_max_sigterm = 50 %
zram_max_sigkill = 60 % zram_max_sigkill = 55 %
##################################################################### #####################################################################
* INTENSITY OF MONITORING (AND CPU USAGE) 2. The frequency of checking the amount of available memory
(and CPU usage)
Coefficients that affect the intensity of monitoring. Reducing Coefficients that affect the intensity of monitoring. Reducing
the coefficients can reduce CPU usage and increase the periods the coefficients can reduce CPU usage and increase the periods
between memory checks. between memory checks.
Why three coefficients rather than one? Because the swap Why three coefficients instead of one? Because the swap fill rate
fill rate is usually lower than the RAM fill rate. is usually lower than the RAM fill rate.
A lower monitoring intensity can be set for swap
without compromising OOM prevention,
thereby reducing the CPU load.
With the default settings the daemon works well enough It is possible to set a lower monitoring intensity for swap
at this intensity, coping successfully with sharp spikes in memory without compromising OOM prevention, thus reducing the CPU load.
consumption.
Default values are well suited to desktops. Default values are well suited to desktops. On servers without rapid
On servers without rapid fluctuations in memory level, the fluctuations in memory levels the values can be reduced.
values can be reduced.
Valid values are positive floating-point numbers. Valid values are positive floating-point numbers.
@ -74,20 +79,19 @@ rate_zram = 1
##################################################################### #####################################################################
* PREVENTION OF KILLING INNOCENT VICTIMS 3. The prevention of killing innocent victims
The minimum oom_score that a process must have The minimum oom_score that a process must have
for a signal to be sent to it. for a signal to be sent to it.
Helps prevent the killing of innocent processes if something Helps prevent the killing of innocent processes if something
goes wrong. Maybe min_badness taking the lists into account? goes wrong.
Valid values are integers from the range [0; 1000]. Valid values are integers from the range [0; 1000].
oom_score_min = 10 min_badness = 10
The minimum delay after sending the corresponding signals The minimum delay after sending the corresponding signals
to reduce the risk of killing many processes at once. to reduce the risk of killing many processes at once.
Must be a non-negative number.
Valid values are non-negative floating-point numbers. Valid values are non-negative floating-point numbers.
@ -104,6 +108,7 @@ min_delay_after_sigkill = 3
Enabling the option requires root privileges. Enabling the option requires root privileges.
Valid values are True and False. Valid values are True and False.
Values are case sensitive.
decrease_oom_score_adj = False decrease_oom_score_adj = False
@ -113,75 +118,79 @@ oom_score_adj_max = 20
##################################################################### #####################################################################
* AVOID AND PREFER VICTIM NAMES VIA REGEX MATCHING 4. Impact on the badness of processes via matching their names
with regular expressions.
You can specify regular expressions (Perl-compatible regular See https://en.wikipedia.org/wiki/Regular_expression and
expressions) that will be matched against https://en.wikipedia.org/wiki/Perl_Compatible_Regular_Expressions
process names to influence their badness.
Enabling this option slows down the search for a victim because Enabling this option slows down the search for the victim
the names of all processes are compared with the given regex patterns. because the names of all processes are compared with the
specified regex patterns.
Valid values are True and False. Valid values are True and False.
use_regex_lists = False regex_matching = False
The badness of processes whose names match preferlist_regex The badness of processes whose names match prefer_regex is
is calculated by the formula calculated by the following formula:
badness = (oom_score + 1) * preferlist_factor badness = (oom_score + 1) * prefer_factor
preferlist_regex = tail|python3 prefer_regex = tail|python3
Valid values are floating-point numbers from the range [1; 1000]. Valid values are floating-point numbers from the range [1; 1000].
preferlist_factor = 3 prefer_factor = 3
List of processes that should preferably not be killed. The badness of processes whose names match avoid_regex is
calculated by the following formula:
badness = oom_score / avoid_factor
The badness of processes whose names match avoidlist_regex avoid_regex = Xorg|sshd
is calculated by the formula
badness = oom_score / avoidlist_factor
avoidlist_regex = Xorg|sshd
Valid values are floating-point numbers from the range [1; 1000]. Valid values are floating-point numbers from the range [1; 1000].
avoidlist_factor = 4 avoid_factor = 3
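The two formulas above can be sketched as follows. This is an illustration under stated assumptions, not the daemon's code: the function name `adjusted_badness` is hypothetical, names are tested with Python's `re.search` (which is similar to, but not identical to, PCRE), the prefer pattern is checked before the avoid pattern, and unmatched names keep their oom_score.

```python
import re


def adjusted_badness(name, oom_score, prefer_regex, prefer_factor,
                     avoid_regex, avoid_factor):
    """Apply the prefer/avoid formulas from the config to one process."""
    if re.search(prefer_regex, name):
        # badness = (oom_score + 1) * prefer_factor
        return (oom_score + 1) * prefer_factor
    if re.search(avoid_regex, name):
        # badness = oom_score / avoid_factor
        return oom_score / avoid_factor
    # Assumption: processes matching neither pattern are left unchanged.
    return oom_score
```

With the example values from the config, a `python3` process with oom_score 9 would get badness 30, while `sshd` with the same score would drop to 3.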
##################################################################### #####################################################################
* EXECUTE THE COMMAND INSTEAD OF SENDING THE SIGTERM SIGNAL 5. The execution of a specific command instead of sending the
SIGTERM signal.
For processes with a particular name you can set a command For processes with a specific name you can specify a command to
that is executed instead of sending the SIGTERM signal run instead of sending the SIGTERM signal.
to the process with that name.
For example, if the process runs as a daemon, then instead of For example, if the process is running as a daemon, you can run
sending SIGTERM a restart command can be executed. the restart command instead of sending SIGTERM.
Valid values are True and False. Valid values are True and False.
execute_the_command = False execute_the_command = False
The process name must not exceed 15 characters. The length of the process name can't exceed 15 characters.
The syntax is as follows: lines starting with ** are treated as lines The syntax is as follows: lines starting with ** are treated
containing process names and the corresponding commands for as lines containing process names and the corresponding
restarting those processes. The process name is followed by a double commands. The process name is followed by a double colon (::),
colon (::) and then the command. and then by the command that is executed if the
An ampersand (&) at the end of the command lets nohang keep working specified process is selected as a victim.
without waiting for the command to finish. An ampersand (&) at the end of the command allows nohang to
continue running without waiting for the command to
finish.
For example: For example:
** mysqld :: systemctl restart mariadb.service & ** mysqld :: systemctl restart mariadb.service &
** php-fpm7.0 :: systemctl restart php7.0-fpm.service & ** php-fpm7.0 :: systemctl restart php7.0-fpm.service
** processname :: some command ** processname :: some command
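A minimal sketch of how lines in the `** name :: command` syntax could be read into a mapping. The function name `parse_command_lines` is hypothetical and the real parser in nohang may differ, for example in how it treats malformed lines.

```python
def parse_command_lines(config_text):
    """Return {process_name: command} for lines that start with '**'."""
    commands = {}
    for line in config_text.splitlines():
        line = line.strip()
        if not line.startswith('**'):
            continue
        name, sep, command = line[2:].partition('::')
        if not sep:
            # Assumption: lines without a double colon are ignored.
            continue
        commands[name.strip()] = command.strip()
    return commands
```

A trailing ampersand is kept as part of the command string, so it only takes effect if the command is later run through a shell.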
Extra sleep time after executing the command (in addition to
min_sleep_after_sigterm).
##################################################################### #####################################################################
* GUI NOTIFICATIONS: 6. GUI notifications:
* RESULTS OF PREVENTING OOM - results of preventing OOM
* LOW MEMORY WARNINGS - low memory warnings
Enabling this option requires notify-send to be installed on the system. Enabling this option requires notify-send to be installed on the system.
On Debian/Ubuntu this is provided by installing the package On Debian/Ubuntu this is provided by installing the package
@ -192,7 +201,7 @@ execute_the_command = False
See also wiki.archlinux.org/index.php/Desktop_notifications See also wiki.archlinux.org/index.php/Desktop_notifications
Valid values are True and False. Valid values are True and False.
desktop_notifications = False gui_notifications = False
Additional options for notify-send. Additional options for notify-send.
See `notify-send --help` and read `man notify-send` See `notify-send --help` and read `man notify-send`
@ -213,7 +222,7 @@ root_display = :0
Desktop notifications must be enabled for this option to work. Desktop notifications must be enabled for this option to work.
Valid values are True and False. Valid values are True and False.
low_memory_warnings = False gui_low_memory_warnings = True
Minimum time between notifications, in seconds. Minimum time between notifications, in seconds.
Valid values are floating-point numbers from the range [1; 300]. Valid values are floating-point numbers from the range [1; 300].
@ -238,32 +247,32 @@ zram_max_warnings = 40 %
##################################################################### #####################################################################
* SELF-DEFENSE AND PREVENTING SLOWING DOWN THE PROGRAM 7. Preventing the slowing down of the program
True - lock the process in memory to prevent it from being swapped out. mlockall() lock ... all of the calling process's virtual address
False - do not lock. space into RAM, preventing that memory from being paged to the
swap area. - `man mlockall`
On Fedora 28 the value True increases the process's memory It is disabled by default because on Fedora 28 mlockall = True
consumption by 200 MiB; on Debian 8 and 9 there is no such problem. increases the memory consumption of the process by
200 MiB. On Debian 8 and 9 there is no such problem.
mlockall = False mlockall = False
Setting negative values for self_nice and self_oom_score_adj Setting negative values for niceness and oom_score_adj
requires root privileges. requires root privileges.
Setting a negative self_nice raises the priority of the process. Setting a negative niceness raises the priority of the process.
Valid values are integers from the range [-20; 19]. Valid values are integers from the range [-20; 19].
self_nice = -15 niceness = -15
# -> niceness Set oom_score_adj for the nohang process.
Set oom_score_adj for the process.
Valid values are integers from the range [-1000; 1000]. Valid values are integers from the range [-1000; 1000].
Setting the values to -1000 will prohibit suicide. Setting the values to -1000 will prohibit suicide.
self_oom_score_adj = -100 oom_score_adj = -100
Read `man ionice` to understand the following parameters. Read `man ionice` to understand the following parameters.
Setting the value to True requires root privileges. Setting the value to True requires root privileges.
@ -279,11 +288,10 @@ realtime_ionice_classdata = 5
##################################################################### #####################################################################
* STANDARD OUTPUT VERBOSITY 8. Output verbosity
Display the configuration when the program starts. Display the configuration when the program starts.
Valid values are True and False. Valid values are True and False.
Values are case sensitive!
print_config = False print_config = False