source: trunk/docs/HOWTO-client+server-troubleshooting.html@ 328

Last change on this file since 328 was 307, checked in by katerina, 14 years ago

Fix for ticket #229 (malfunction on CentOS 4.8 / gcc4), documentation update.

File size: 13.3 KB
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<title>HOWTO client+server troubleshooting</title>
<style type="text/css">
<!--

html { background: #eee; color: #000; }

body { background: #eee; color: #000; margin: 0; padding: 0;}

div.body {
 background: #fff; color: #000;
 margin: 0 1em 0 1em; padding: 1em;
 font-family: serif;
 font-size: 1em; line-height: 1.2em;
 border-width: 0 1px 0 1px;
 border-style: solid;
 border-color: #aaa;
}

div.block {
 background: #b6c5f2; color: #000;
 margin: 1em; padding: 0 1em 0 1em;
 border-width: 1px;
 border-style: solid;
 border-color: #2d4488;
}

div.warnblock {
 background: #b6c5f2; color: #000;
 background: #ffffcc; color: #000;
 margin: 1em; padding: 0 1em 0 1em;
 border-width: 1px;
 border-style: solid;
 border-color: #FF9900;
}

table {
 background: #F8F8F8; color: #000;
 margin: 1em;
 border-width: 0 0 0 1px;
 border-style: solid;
 border-color: #C0C0C0;
}

td {
 border-width: 0 1px 1px 0;
 border-style: solid;
 border-color: #C0C0C0;
}

th {
 background: #F8F8FF;
 border-width: 1px 1px 2px 0;
 border-style: solid;
 border-color: #C0C0C0;
}


/* body text, headings, and rules */

p { margin: 0; text-indent: 0em; margin: 0 0 0.5em 0 }

h1, h2, h3, h4, h5, h6 {
 color: #206020; background: transparent;
 font-family: Optima, Arial, Helvetica, sans-serif;
 font-weight: normal;
}

h1 { font-size: 1.69em; margin: 1.4em 0 0.4em 0; }
h2 { font-size: 1.44em; margin: 1.4em 0 0.4em 0; }
h3 { font-size: 1.21em; margin: 1.4em 0 0.4em 0; }
h4 { font-size: 1.00em; margin: 1.4em 0 0.4em 0; }
h5 { font-size: 0.81em; margin: 1.4em 0 0.4em 0; }
h6 { font-size: 0.64em; margin: 1.4em 0 0.4em 0; }

hr {
 color: transparent; background: transparent;
 height: 0px; margin: 0.6em 0;
 border-width: 1px ;
 border-style: solid;
 border-color: #999;
}

/* bulleted lists and definition lists */

ul { margin: 0 1em 0.6em 2em; padding: 0; }
li { margin: 0.4em 0 0 0; }

dl { margin: 0.6em 1em 0.6em 2em; }
dt { color: #285577; }

tt { color: #602020; }

/* links */

a.link {
 color: #33c; background: transparent;
 text-decoration: none;
}

a:hover {
 color: #000; background: transparent;
}

body > a {
 font-family: Optima, Arial, Helvetica, sans-serif;
 font-size: 0.81em;
}

h1, h2, h3, h4, h5, h6 {
 color: #2d5588; background: transparent;
 font-family: Optima, Arial, Helvetica, sans-serif;
 font-weight: normal;
}

 -->
</style></head>

<body>
<div class="body">
<p style="text-align: center; background: #ccc; border: 1px solid #2d5588;"><a
 style="text-decoration: none;"
 href="http://www.la-samhna.de/samhain/">samhain file integrity
 scanner</a>&nbsp;|&nbsp;<a style="text-decoration: none;"
 href="http://www.la-samhna.de/samhain/s_documentation.html">online
 documentation</a></p>
<br><center>
<h1>Samhain client/server: What can go wrong, and how can you fix it?</h1>
</center>
<br>
<hr>
<div class="warnblock">
<ul>
  <li>Almost all problems can only be diagnosed correctly by checking the
  <b>server logs</b>.</li>
  <li>
  If the server does not write logs, <b>fix this first</b>. For debugging,
  stop the server, then run it in the foreground with
  <tt>yule -p info --foreground</tt>.
  <ul>
    <li>
    By default, the server logs to the file
    <tt>/var/log/yule/yule_log</tt>. Since the server drops
    root privileges on startup, the directory <tt>/var/log/yule</tt>
    must be writable by the nonprivileged user the server runs
    as (the first existing one of: yule, daemon, nobody).
    </li>
    <li>
    Logging to the logfile must be enabled in the
    <tt>/etc/yulerc</tt> config file (e.g. <tt>LogSeverity=mark</tt>, or
    <tt>LogSeverity=info</tt> for more verbose output).
    </li>
  </ul>
  </li>
</ul>
</div>
<p>
This document explains how to diagnose and fix common problems that
result from misunderstanding or misconfiguration when setting up
a client/server samhain system. It is divided into several sections
that correspond roughly to the stages a client passes through when it
connects to a server. Each section starts with a brief explanation that
should provide a basic understanding of what is going on.
</p>
<p>
This document does not discuss <i>how</i> to set up a client/server system (for
this, see the manual and/or the HOWTO-client+server).
</p>

<h2><a name="toc">Table of Contents</a></h2>
<p>
<a href="#sect1">Connecting to the server</a><br>
<a href="#sect2">Authentication</a><br>
<a href="#sect3">Downloading config/database files</a><br>
<a href="#sect4">Other connection problems</a><br>
</p>

<h2><a name="sect1">Connecting to the server</a></h2>

<p>
Client/server connections are always initiated by the client. The port
is compiled in (there is a configure option to change the default).
The default port is 49777.
</p>

<h3>Problem #1</h3>
<p>
The client reports: <b>Connection refused</b>. The server reports nothing.
</p>
<p>
Either the server is down, it listens on a different port than the one the
client connects to, or there is a network failure.
</p>
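<p>
To tell these cases apart, it can help to try a plain TCP connection from the
client machine to the server port. The following is a minimal Python sketch,
not part of samhain; the function name <tt>probe</tt> and the result strings
are made up for this illustration, and the port is the default named above.
</p>

```python
import errno
import socket

DEFAULT_PORT = 49777  # default port named in this document; yours may differ


def probe(host, port=DEFAULT_PORT, timeout=3.0):
    """Try one TCP connection to host:port and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"      # something accepted the connection
    except ConnectionRefusedError:
        return "refused"       # host reachable, but nothing listens on this port
    except socket.timeout:
        return "timeout"       # no answer at all: firewall or routing problem
    except OSError as exc:
        # Anything else (name resolution failure, unreachable network, ...)
        return "error: " + errno.errorcode.get(exc.errno, str(exc))
```

<p>
A result of <tt>refused</tt> suggests that the server is down or listens on
another port, while <tt>timeout</tt> points towards the network or a firewall.
</p>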

<h3>Problem #2</h3>
<p>
The client reports: <b>Connection error: Connection reset by peer</b>, and
later also <b>Session key negotiation failed</b>. The server reports:
<b>msg=&quot;Refused connection from ...&quot; subroutine=&quot;libwrap&quot;</b>.
</p>
<p>
The server is compiled with libwrap (TCP Wrapper) support, and either the
client is listed in <tt>/etc/hosts.deny</tt>, or you have set <i>yule: ALL</i>
in <tt>/etc/hosts.deny</tt> and forgotten to add the client to
<tt>/etc/hosts.allow</tt>.
</p>
<p>
To fix: make proper entries in <tt>/etc/hosts.allow</tt> and/or
<tt>/etc/hosts.deny</tt>. There is no need to restart/reload the server.
</p>


<h2><a name="sect2">Authentication</a></h2>
<p>
The client has a password that is used to authenticate to the server.
This password is embedded in the client binary, and is set with the
<tt>samhain_setpwd</tt> helper application, as explained e.g. in the
manual or in the Client+Server HOWTO.
</p><p>
The server has a list of clients that are allowed to connect, and the
verifiers corresponding to the passwords of these clients.
</p>
<p>
Upon successful authentication, client and server negotiate
a <b>session key</b> that is used for signing further messages
from the client.
</p>

<h3>Problem #1</h3>

<p>
If the password is wrong, the client will report
<b>Session key negotiation failed</b>. The server will
report: <b>Invalid connection attempt: Session key mismatch</b>.
</p>
<p>
To fix: make sure that the password has in fact been set, that you are
using the correct executable for the client (the one in which the password is
set), and that the entry in the server config file is the one generated
for this password (also look out for duplicate entries for this client).
</p>

<h3>Problem #2</h3>

<p>
If the client name (as resolved on the server) is wrong, the client
will report
<b>Session key negotiation failed</b>. The server will
report: <b>Invalid connection attempt: Not in client list</b>,
<i>and</i> it will tell you in the same error message
what name it has inferred for the connecting
client (example): <b>client=&quot;client.mydomain.com&quot;</b>.
</p>
<p>
The fix depends on the nature of the problem. In principle, it is
sufficient to change the name of the client in the config file entry to the
name the server reports, but that is not really a solution if e.g. the server
thinks the client is 'localhost'.
</p>
<p>
There are two different ways to determine the client name.
Unfortunately, judging
from customer feedback as well as from common sense, neither works very well
with a messed-up local DNS (including <tt>/etc/hosts</tt> files) and/or
&uuml;berparanoid or misconfigured firewalls (in case of connections
across one).
</p>
<ul>
  <li>
    <p>
    <i>First method: Determine client name on client, and
    try to cross-check on server</i>
    </p>
    <p>
    This does not work for a number of people because
    </p>
    <ol>
      <li>
      the <tt>/etc/hosts</tt> file on the client machine has errors
      (yes, there are plenty of machines with a completely
      messed-up <tt>/etc/hosts</tt> file),
      </li>
      <li>
      the server cannot resolve the client address because the local DNS is
      misconfigured, or
      </li>
      <li>
      the client machine has multiple network interfaces, and
      the interface used is not the one the client name resolves to.
      </li>
    </ol>

    <p>
    If the client uses the wrong interface on a multi-interface machine,
    the config file option
    <tt>SetBindAddress=</tt><i>IP address</i>
    lets you choose the interface the client will use for
    outgoing connections.
    </p>
    <p>
    If you want to download the config file from the server, you
    should instead use the corresponding command line option
    <tt>--bind-address=</tt><i>IP address</i>
    to select the interface (a config file option cannot take effect before
    the config file has been downloaded).
    </p>

    <p>
    If you encounter problems, you may (1) fix your
    <tt>/etc/hosts</tt> file(s), (2) fix your local DNS, or
    (3) switch to the second method.
    </p>
    <p>
    Error messages related to name resolving/cross-checking can be
    suppressed by setting a
    very low severity (lower than the logging threshold), e.g.
    </p>
    <p>
    <tt>SeverityLookup=</tt><i>debug</i>
    </p>
    <p>
    in the <i>Misc</i> section of the server configuration,
    if you prefer running <i>unsafe</i> at any speed
    instead of fixing the problem (you have been warned). Doing so will
    allow an attacker to pose as the client.
    </p>
  </li>
  <li>
    <p><i>Second method: Use address of connecting entity as
    known to the communication layer</i></p>
    <p>
    This method was dropped as the default
    long ago because that address may not always be the
    address of the client machine.
    To enable this method, use
    </p>
    <p>
    <tt>SetClientFromAccept=</tt><i>true</i>
    </p>
    <p>
    in the <i>Misc</i> section of the server configuration
    file. If the address cannot be resolved, or reverse lookup of the
    resolved name fails, <i>no</i> error message will be issued,
    but the numerical address will be used.
    </p>
  </li>
</ul>
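<p>
Before changing the server configuration, you may want to check on the server
machine whether forward and reverse lookups for the client agree at all. The
sketch below only approximates the idea of the cross-check; it is not the
code samhain uses, and the function name <tt>cross_check</tt> and its
parameters are invented for this example. The two resolver arguments default
to the standard library lookups and exist only so that the function can be
tried without a working DNS.
</p>

```python
import socket


def cross_check(claimed_name, peer_ip,
                forward=socket.gethostbyname_ex,
                reverse=socket.gethostbyaddr):
    """Return a list of problems found when comparing a name with an address."""
    problems = []

    # Forward lookup: does the claimed name resolve to the connecting address?
    try:
        _, _, addresses = forward(claimed_name)
        if peer_ip not in addresses:
            problems.append("%s resolves to %s, not to %s"
                            % (claimed_name, addresses, peer_ip))
    except OSError as exc:
        problems.append("cannot resolve %s: %s" % (claimed_name, exc))

    # Reverse lookup: does the connecting address map back to the claimed name?
    try:
        name, aliases, _ = reverse(peer_ip)
        names = {n.lower() for n in [name] + list(aliases)}
        if claimed_name.lower() not in names:
            problems.append("%s maps back to %s, not to %s"
                            % (peer_ip, sorted(names), claimed_name))
    except OSError as exc:
        problems.append("no reverse lookup for %s: %s" % (peer_ip, exc))

    return problems
```

<p>
An empty list means that name and address are consistent as seen from the
machine where the check runs. A reverse lookup that yields 'localhost' shows
up as a mismatch.
</p>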


<h2><a name="sect3">Downloading config/database files</a></h2>

<p>
The client does <i>not</i> tell the server the path to the requested
file. It only tells the server the <em>type</em> of the file, i.e.
either a configuration file or a database file. It is entirely the
responsibility of the server to locate the correct file and send it.
</p>
<p>
The server has a <i>data directory</i>, which by default is
<tt>/var/lib/yule</tt>. The config/database files should be placed here.
</p>
<p>
Configuration files: <tt>rc.</tt><i>client.mydomain.tld</i> or
simply <tt>rc</tt>
(this can be used as a catchall file).
</p>
<p>
Database files: <tt>file.</tt><i>client.mydomain.tld</i> or
simply <tt>file</tt>
(this can be used as a catchall file).
</p>
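<p>
The naming scheme can be written down as a small Python sketch that lists the
file names to look for in the data directory. The helper names
<tt>candidates</tt> and <tt>first_readable</tt> are made up for this example,
and the order (client-specific file first, catchall second) is an assumption
drawn from the word 'catchall', not a statement about the server's code.
</p>

```python
import os

DATA_DIR = "/var/lib/yule"   # default data directory named in this document
PREFIX = {"config": "rc", "database": "file"}


def candidates(client, kind, data_dir=DATA_DIR):
    """Paths for one client: the client-specific file, then the catchall."""
    prefix = PREFIX[kind]
    return [os.path.join(data_dir, prefix + "." + client),
            os.path.join(data_dir, prefix)]


def first_readable(client, kind, data_dir=DATA_DIR):
    """Return the first candidate that exists and is readable, else None.

    os.access() answers for the user running this script, so run it as
    the unprivileged user the server runs as to get a meaningful result.
    """
    for path in candidates(client, kind, data_dir):
        if os.path.isfile(path) and os.access(path, os.R_OK):
            return path
    return None
```

<p>
For example, <tt>candidates("client.mydomain.com", "config")</tt> yields
<tt>/var/lib/yule/rc.client.mydomain.com</tt> followed by
<tt>/var/lib/yule/rc</tt>.
</p>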

<h3>Problem #1</h3>

<p>
If the server cannot access the configuration (or database) file, either
because it does not exist or because the server has no read permission, the
client will report <b>File download failed</b>. The server will
report: <b>File not accessible</b>, <i>and</i> it will tell you in the
same report the path where it expected the file (example):
<b>path=&quot;/var/lib/yule/rc.client.mydomain.com&quot;</b>.
</p>
<p>
To fix: put the file in the correct location, and make sure the permissions
are ok.
</p>
<ul>
  <li>
  Note that <em>the server drops root privileges at startup</em> and
  runs as an unprivileged user (the first existing one of:
  yule, daemon, nobody).
  </li>
  <li>
  Also remember that to access a file, at least execute permission is required
  <em>for every directory in the path</em>.
  </li>
</ul>
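<p>
Both points can be checked with a short Python sketch that walks the path
reported by the server and applies the classic owner/group/other permission
rule to each component. It is an illustration only: the function names are
invented here, and ACLs, supplementary security modules and the special
rights of root are ignored.
</p>

```python
import os
import pwd
import stat


def server_user(names=("yule", "daemon", "nobody")):
    """Return the passwd entry of the first existing user, or None."""
    for name in names:
        try:
            return pwd.getpwnam(name)
        except KeyError:
            continue
    return None


def allowed(st, uid, gids, want):
    """Owner/group/other check; want is a bit mask (4 = read, 1 = execute)."""
    if uid == st.st_uid:
        bits = (st.st_mode >> 6) & 7
    elif st.st_gid in gids:
        bits = (st.st_mode >> 3) & 7
    else:
        bits = st.st_mode & 7
    return bits & want == want


def blocked_components(path, uid, gids, start="/"):
    """List the components from start down to path that uid cannot pass.

    Directories need execute (search) permission, the file itself needs
    read permission. Run this as a user who can stat the whole path.
    """
    current = os.path.abspath(path)
    start = os.path.abspath(start)
    chain = []
    while True:
        chain.append(current)
        parent = os.path.dirname(current)
        if current == start or current == parent:
            break
        current = parent
    blocked = []
    for item in reversed(chain):
        st = os.stat(item)
        want = 1 if stat.S_ISDIR(st.st_mode) else 4
        if not allowed(st, uid, gids, want):
            blocked.append(item)
    return blocked
```

<p>
A typical call would pass <tt>pw_uid</tt> and <tt>pw_gid</tt> from the entry
returned by <tt>server_user()</tt>, together with the <tt>path=</tt> value
from the server's error message. Every entry in the returned list needs its
permissions or ownership adjusted.
</p>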


<h2><a name="sect4">Other connection problems</a></h2>

<p>
The server has a table with client names and their session keys. If
another client process accesses the server from the same host,
it will negotiate a fresh session key for that host. As a consequence,
the session key of the first client process will become <i>invalid</i>.
</p>
<p>
Also, the server keeps track of the status of a client. If a client
process does not announce its termination to the server, the server
will not expect a <i>startup</i> message, and will issue a warning for any
such message.
</p>

<h3>Problem #1</h3>

<p>
The client reports: <b>Invalid connection state</b>. The server reports:
<b>Invalid connection attempt: Signature mismatch</b>. This is a sign that
a client has tried to connect using an invalid session key. Most probably,
another instance of the client is/was started on the respective host.
</p>
<p>
To fix: if you need to have concurrent access to the server,
suspend the first process with SIGUSR2 before starting the second, and give
it a second or two to return to the main event loop and go into suspend
mode. Use SIGUSR2 again to wake up the first process. Do not just use
SIGSTOP/SIGCONT: it is important that the client tells the server that
it will go into suspend.
</p>
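<p>
The sequence described above (suspend, wait, run the second process, wake up)
can be scripted. The following Python sketch shows one way to do it; the
function name <tt>run_while_suspended</tt> and its parameters are made up for
this example, and the two-second default simply follows the advice above.
</p>

```python
import os
import signal
import time


def run_while_suspended(first_pid, action, send=os.kill, settle=2.0):
    """Suspend the first client, run action(), then wake the first client.

    first_pid is the process id of the already running client; action is
    any callable that starts the second client and waits for it to finish.
    """
    send(first_pid, signal.SIGUSR2)      # ask the running client to suspend
    time.sleep(settle)                   # let it reach the main loop and tell the server
    try:
        return action()
    finally:
        send(first_pid, signal.SIGUSR2)  # the same signal again wakes it up
```

<p>
The <tt>finally</tt> clause makes sure the first process is woken up even if
the second one fails to start.
</p>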

<h3>Problem #2</h3>

<p>
The server reports:
<b>Restart without prior exit</b> for a client.
This is a sign that
a client has restarted without informing the server about a previous
termination.
</p>
<p>
This happens if the client was killed with SIGKILL, or if it terminated
within the routine that sends a message to the server (the routine is
not re-entrant). You may want to investigate messages logged via another
logging facility (e.g. the client's local logfile). Of course it <i>may</i>
also be a segfault, which would be reported via syslog.
</p>

</div>
</body>
</html>