From 8f8a9f01909ba29e2b781310baeeaaddc3f15f0d Mon Sep 17 00:00:00 2001
From: "Gerald W. Carter"
Date: Tue, 22 Apr 2008 10:09:40 -0500
Subject: Moving docs tree to docs-xml to make room for generated docs in the release tarball.
(This used to be commit 9f672c26d63955f613088489c6efbdc08b5b2d14)
---
docs-xml/Samba3-HOWTO/TOSHARG-HighAvailability.xml | 500 +++++++++++++++++++++
1 file changed, 500 insertions(+)
create mode 100644 docs-xml/Samba3-HOWTO/TOSHARG-HighAvailability.xml
(limited to 'docs-xml/Samba3-HOWTO/TOSHARG-HighAvailability.xml')
diff --git a/docs-xml/Samba3-HOWTO/TOSHARG-HighAvailability.xml b/docs-xml/Samba3-HOWTO/TOSHARG-HighAvailability.xml
new file mode 100644
index 0000000000..1ce81d404e
--- /dev/null
+++ b/docs-xml/Samba3-HOWTO/TOSHARG-HighAvailability.xml
@@ -0,0 +1,500 @@
+
+
+
+
+&author.jht;
+&author.jeremy;
+
+
+High Availability
+
+
+Features and Benefits
+
+
+availability
+intolerance
+vital task
+Network administrators are often concerned about the availability of file and print
+services. Network users tend to be intolerant of interruptions to the services they depend
+on to perform vital tasks.
+
+
+
+A sign in a computer room served to remind staff of their responsibilities. It read:
+
+
+fail
+managed by humans
+economically wise
+anticipate failure
+All humans fail; in ways both great and small, we fail continually. Machines fail too.
+Computers are machines that are managed by humans, and the fallout from failure
+can be spectacular. Your responsibility is to deal with failure: to anticipate it
+and to eliminate it as far as is humanly and economically wise.
+Are your actions part of the problem or part of the solution?
+
+ + +If we are to deal with failure in a planned and productive manner, then first we must +understand the problem. That is the purpose of this chapter. + + + +high availability +CIFS/SMB +state of knowledge +Parenthetically, in the following discussion there are seeds of information on how to +provision a network infrastructure against failure. Our purpose here is not to provide +a lengthy dissertation on the subject of high availability. Additionally, we have made +a conscious decision to not provide detailed working examples of high availability +solutions; instead we present an overview of the issues in the hope that someone will +rise to the challenge of providing a detailed document that is focused purely on +presentation of the current state of knowledge and practice in high availability as it +applies to the deployment of Samba and other CIFS/SMB technologies. + + +
+
+
+Technical Discussion
+
+
+SambaXP conference
+Germany
+inspired structure
+The following summary was part of a presentation by Jeremy Allison at the SambaXP 2003
+conference that was held at Goettingen, Germany, in April 2003. Material has been added
+from other sources, but it was Jeremy who inspired the structure that follows.
+
+
+
+ The Ultimate Goal
+
+
+clustering technologies
+affordable power
+unstoppable services
+ All clustering technologies aim to achieve one or more of the following:
+
+
+
+ Obtain the maximum affordable computational power.
+ Obtain faster program execution.
+ Deliver unstoppable services.
+ Avert points of failure.
+ Achieve the most effective utilization of resources.
+
+
+
+ A clustered file server ideally has the following properties:
+clustered file server
+connect transparently
+transparently reconnected
+distributed file system
+
+
+
+ All clients can connect transparently to any server.
+ A server can fail and clients are transparently reconnected to another server.
+ All servers serve out the same set of files.
+ All file changes are immediately seen on all servers.
+ Requires a distributed file system.
+ Infinite ability to scale by adding more servers or disks.
+
+
+
+
+
+ Why Is This So Hard?
+
+
+ In short, the problem is one of state.
+
+
+
+
+
+state information
+ All TCP/IP connections are dependent on state information.
+
+
+TCP failover
+ The TCP connection involves a packet sequence number. This
+ sequence number would need to be dynamically updated on all
+ machines in the cluster to effect seamless TCP failover.
+
+
+
+
+CIFS/SMB
+TCP
+ CIFS/SMB (the Windows networking protocols) uses TCP connections.
+
+
+ This means that from a basic design perspective, failover is not
+ seriously considered.
+
+
+ All current SMB clusters are failover solutions
+ &smbmdash; they rely on the clients to reconnect. They provide server
+ failover, but clients can lose information due to a server failure.
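Because current SMB clusters rely on clients to reconnect, the client-side behavior described above amounts to a simple reconnect loop. The following is an illustrative sketch only; the `connect_with_failover` helper and the server list are hypothetical and not part of Samba or any SMB client implementation:

```python
import socket

def connect_with_failover(servers, timeout=5.0):
    """Try each (host, port) in the pool until one accepts the connection.

    This mirrors what SMB clients do today: there is no transparent TCP
    failover, so after a server failure the client simply reconnects to
    another node and loses any in-flight state (open files, locks).
    """
    last_error = None
    for host, port in servers:
        try:
            sock = socket.create_connection((host, port), timeout=timeout)
            return sock, (host, port)
        except OSError as exc:
            last_error = exc  # this node is unreachable; try the next one
    raise ConnectionError(f"all servers in the pool failed: {last_error}")
```

Any state bound to the old TCP connection (sequence numbers, open file handles, locks) is lost at the moment of reconnection, which is exactly why this is failover rather than transparent clustering.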
+server failure
+
+
+
+
+
+
+ Servers keep state information about client connections.
+
+state
+ CIFS/SMB involves a lot of state.
+ Every file open must be compared with other open files
+ to check share modes.
+
+
+
+
+
+
+ The Front-End Challenge
+
+
+cluster servers
+single server
+TCP data streams
+front-end virtual server
+virtual server
+de-multiplex
+SMB
+ To make it possible for a cluster of file servers to appear as a single server that has one
+ name and one IP address, the incoming TCP data streams from clients must be processed by the
+ front-end virtual server. This server must de-multiplex the incoming packets at the SMB protocol
+ layer and then feed each SMB packet to the correct server in the cluster.
+
+
+
+IPC$ connections
+RPC calls
+ One could split all IPC$ connections and RPC calls to one server to handle printing and user
+ lookup requirements. RPC printing handles are shared between different IPC$ sessions &smbmdash; it is
+ hard to split this across clustered servers!
+
+
+
+ Conceptually speaking, all other servers would then provide only file services. This is a simpler
+ problem to concentrate on.
+
+
+
+
+
+ Demultiplexing SMB Requests
+
+
+SMB requests
+SMB state information
+front-end virtual server
+complicated problem
+ De-multiplexing of SMB requests requires knowledge of SMB state information,
+ all of which must be held by the front-end virtual server.
+ This is a perplexing and complicated problem to solve.
+
+
+
+vuid
+tid
+fid
+ Windows XP and later have changed semantics so state information (vuid, tid, fid)
+ must match for a successful operation. This makes things simpler than before and is a
+ positive step forward.
+
+
+
+SMB requests
+Terminal Server
+ SMB requests are sent by vuid to their associated server. No code exists today to
+ effect this solution. This problem is conceptually similar to the problem of
+ correctly handling requests from multiple users of a Windows 2000
+ Terminal Server in Samba.
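The routing-by-vuid idea can be modelled in a few lines. This is only a conceptual sketch of the demultiplexing step, not Samba code; the class name, the backend labels, and the modulo assignment policy are all hypothetical:

```python
class SmbDemultiplexer:
    """Toy model of a front-end virtual server that routes requests by vuid.

    Each authenticated session (vuid) is pinned to one backend so that all
    of its associated state (tid, fid, locks) lives on a single server -
    the property the text says a real demultiplexer would need.
    """

    def __init__(self, pool):
        self.pool = list(pool)  # backend servers in the cluster
        self.sessions = {}      # vuid -> backend server

    def route(self, vuid):
        # Assign unseen vuids to a backend; keep existing pins stable.
        if vuid not in self.sessions:
            self.sessions[vuid] = self.pool[vuid % len(self.pool)]
        return self.sessions[vuid]
```

A real front end would additionally have to parse SMB packets to extract the vuid and cope with backend failure, which is where the hard state problems reappear.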
+
+
+de-multiplexing
+ One possibility is to start by exposing the server pool to clients directly.
+ This could eliminate the de-multiplexing step.
+
+
+
+
+
+ The Distributed File System Challenge
+
+
+Distributed File Systems
+ There exist many distributed file systems for UNIX and Linux.
+
+
+
+backend
+SMB semantics
+share modes
+locking
+oplock
+distributed file systems
+ Many could be adapted to serve as the backend for our cluster, so long as awareness of SMB
+ semantics is kept in mind (share modes, locking, and oplock issues in particular).
+ Common free distributed file systems include:
+NFS
+AFS
+OpenGFS
+Lustre
+
+
+
+ NFS
+ AFS
+ OpenGFS
+ Lustre
+
+
+
+server pool
+ The server pool (cluster) can use any distributed file system backend if all SMB
+ semantics are performed within this pool.
+
+
+
+
+
+ Restrictive Constraints on Distributed File Systems
+
+
+SMB services
+oplock handling
+server pool
+backend file system pool
+ Where a clustered server provides purely SMB services, oplock handling
+ may be done within the server pool without imposing a need for this to
+ be passed to the backend file system pool.
+
+
+
+NFS
+interoperability
+ On the other hand, where the server pool also provides NFS or other file services,
+ it will be essential that the implementation be oplock-aware so it can
+ interoperate with SMB services. This is a significant challenge today. A failure
+ to provide this interoperability will result in a serious loss of performance that will be
+ sorely noted by users of Microsoft Windows clients.
+
+
+
+ Last, all state information must be shared across the server pool.
+
+
+
+
+
+ Server Pool Communications
+
+
+POSIX semantics
+SMB
+POSIX locks
+SMB locks
+ Most backend file systems support POSIX file semantics. This makes it difficult
+ to push SMB semantics back into the file system. POSIX locks have different properties
+ and semantics from SMB locks.
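The share-mode check mentioned above (every open compared against all existing opens) is one of the SMB semantics a backend must honour. Here is a sketch of the idea using hypothetical flag values; the real SMB access and share-mode constants differ:

```python
# Hypothetical access/share flag values for illustration only; the real
# SMB constants (FILE_SHARE_READ and friends) differ.
READ, WRITE, DELETE = 1, 2, 4

class OpenFile:
    """One existing open: what it does, and what it lets others do."""
    def __init__(self, access, share):
        self.access = access  # operations this opener performs
        self.share = share    # operations it permits concurrent openers

def share_modes_allow(existing_opens, access, share):
    """SMB open semantics: a new open succeeds only if it is compatible
    with every open already held on the file - the 'compared with other
    open files' step described in the text. POSIX has no such mandatory
    check, which is one reason SMB semantics are hard to push into a
    POSIX backend."""
    for f in existing_opens:
        if access & ~f.share:   # they must permit what we want to do
            return False
        if f.access & ~share:   # we must permit what they already do
            return False
    return True
```

In a cluster, this check must see every open on every node, which is why the state has to be shared across the server pool.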
+
+
+smbd
+tdb
+Clustered smbds
+ All smbd processes in the server pool must be able to communicate
+ very quickly. For this, the current tdb file structure that Samba
+ uses is not suitable for use across a network. Clustered smbds must use something else.
+
+
+
+
+
+ Server Pool Communications Demands
+
+
+ High-speed interserver communication in the server pool is a design prerequisite
+ for a fully functional system. Possibilities for this include:
+
+
+
+Myrinet
+scalable coherent interfaceSCI
+
+ Proprietary shared memory bus (example: Myrinet or SCI [scalable coherent interface]).
+ These are high-cost items.
+
+
+
+ Gigabit Ethernet (now quite affordable).
+
+
+
+ Raw Ethernet framing (to bypass TCP and UDP overheads).
+
+
+
+
+ We have yet to identify metrics for the performance demands that must be met to make
+ this work effectively.
+
+
+
+
+
+ Required Modifications to Samba
+
+
+ Samba needs to be significantly modified to work with a high-speed server interconnect
+ system to permit transparent failover clustering.
+
+
+
+ Particular functions inside Samba that will be affected include:
+
+
+
+
+ The locking database, oplock notifications,
+ and the share mode database.
+
+
+
+failure semantics
+oplock messages
+ Failure semantics need to be defined. Samba behaves the same way as Windows:
+ when oplock messages fail, a file open request is allowed, but this is
+ potentially dangerous in a clustered environment. So how should interserver
+ pool failure semantics function, and how should such functionality be implemented?
+
+
+
+ Should this be implemented using a point-to-point lock manager, or can this
+ be done using multicast techniques?
+
+
+
+
+
+
+
+ A Simple Solution
+
+
+failover servers
+exported file system
+distributed locking protocol
+ Allowing failover servers to handle different functions within the exported file system
+ removes the problem of requiring a distributed locking protocol.
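The partitioning idea can be sketched as a static mapping from each exported share to a single active/standby pair, so all locking for a share stays on one node. The function name and the pair labels below are illustrative only, not an existing Samba mechanism:

```python
import hashlib

def server_for_share(share_name, pairs):
    """Map an exported share to exactly one failover pair.

    Because each share is only ever served by one active node, share
    modes and locks stay local to that node and no distributed locking
    protocol is needed - at the price of a partitioned namespace that
    the administrator must keep track of.
    """
    digest = hashlib.sha1(share_name.encode("utf-8")).digest()
    return pairs[digest[0] % len(pairs)]
```

The mapping must be stable across restarts (hence a hash rather than round-robin assignment), because clients expect a given share to stay on the same server.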
+
+
+high-speed server interconnect
+complex file name space
+ If only one server is active in a pair, the need for a high-speed server interconnect is avoided.
+ This allows the use of existing high-availability solutions, instead of inventing a new one.
+ This simpler solution comes at a price &smbmdash; the need to manage a more
+ complex file name space. Since there is no longer a single file system, administrators
+ must remember where all services are located &smbmdash; a complexity not easily dealt with.
+
+
+
+virtual server
+ The virtual server is still needed to redirect requests to backend
+ servers. Backend file space integrity is the responsibility of the administrator.
+
+
+
+
+
+ High-Availability Server Products
+
+
+resource failover
+high-availability services
+dedicated heartbeat
+LAN
+failover process
+ Failover servers must communicate in order to handle resource failover. This is essential
+ for high-availability services. The use of a dedicated heartbeat is a common technique to
+ introduce some intelligence into the failover process. This is often done over a dedicated
+ link (LAN or serial).
+
+
+
+SCSI
+Red Hat Cluster Manager
+Microsoft Wolfpack
+Fiber Channel
+failover communication
+ Many failover solutions (like Red Hat Cluster Manager and Microsoft Wolfpack)
+ can use a shared SCSI or Fiber Channel disk storage array for failover communication.
+ Information regarding Red Hat high availability solutions for Samba may be obtained from
+ www.redhat.com.
+
+
+
+Linux High Availability project
+ The Linux High Availability project is a resource worthy of consultation if your desire is
+ to build a highly available Samba file server solution. Please consult the home page at
+ www.linux-ha.org/.
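The dedicated-heartbeat technique reduces to simple bookkeeping: declare the peer dead when no beat arrives within a deadtime, then begin resource takeover. A minimal sketch, with illustrative timings and an injectable clock (real products such as the Linux-HA heartbeat add redundant links, fencing, and quorum):

```python
import time

class Heartbeat:
    """Minimal heartbeat bookkeeping for a two-node failover pair."""

    def __init__(self, deadtime=10.0, clock=time.monotonic):
        self.deadtime = deadtime  # seconds of silence before declaring death
        self.clock = clock        # injectable time source (eases testing)
        self.last_beat = self.clock()

    def beat(self):
        # Called whenever a heartbeat message arrives from the peer.
        self.last_beat = self.clock()

    def peer_is_dead(self):
        # Resource takeover should begin once the peer has been silent
        # for longer than the configured deadtime.
        return self.clock() - self.last_beat > self.deadtime
```

A two-node pair would run one of these per peer, feeding `beat()` from messages received over the dedicated LAN or serial link.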
+
+
+backend failures
+continuity of service
+ Front-end server complexity remains a challenge for high availability because it must deal
+ gracefully with backend failures, while at the same time providing continuity of service
+ to all network clients.
+
+
+
+
+
+ MS-DFS: The Poor Man's Cluster
+
+
+MS-DFS
+DFSMS-DFS, Distributed File Systems
+ MS-DFS links can be used to redirect clients to disparate backend servers. This pushes
+ complexity back to the network client, a capability Microsoft has already built into its clients.
+ MS-DFS creates the illusion of a simple, continuous file system name space that works even
+ at the file level.
+
+
+
+ In short, at the cost of greater management complexity, a distributed system (pseudo-cluster) can
+ be created using existing Samba functionality.
+
+
+
+
+
+ Conclusions
+
+
+ Transparent SMB clustering is hard to do!
+ Client failover is the best we can do today.
+ Much more work is needed before a practical and manageable high-availability transparent cluster solution will be possible.
+ MS-DFS can be used to create the illusion of a single transparent cluster.
+
+
+
+
+
-- cgit