ChangeLog of Virtual Server patch for Linux 2.2


Virtual Server patch for Linux 2.2.14 - Version 0.9.7 - January 19, 2000

Changes:
    * Just resolved a patch rejection on Configure.help for 2.2.14


Virtual Server patch for Linux 2.2 - Version 0.9.7 - December 22, 1999

Changes:
    * Fixed the huge timeout entry bug when destinations are unavailable

      When the destination server of a packet is found unavailable, the
      packet is dropped silently, but the entry was not added back to
      the slow timer table. This would generate entries with huge
      timeouts. Thank Julian for the bug.

    * Changed two IP_VS_ERR calls to IP_VS_DBG

      Since ipvsadm reports the error information when deleting a
      nonexistent destination or adding an existing service, there is
      no need to report an error message in the kernel. Thank Julian
      again for the change.

    * Added the sysctl_ip_always_defrag counting in ip_masq_new_vs

      This is for the coming kernel patch 2.2.14, where the wrong
      sysctl_ip_always_defrag handling is fixed.


Virtual Server patch for Linux 2.2 - Version 0.9.6 - December 7, 1999

Changes:
    * Invalidate a persistent template when its dest is unavailable

      We define templates like (persistence for a single service) or
      (persistence for all services) are valid, and templates like are
      invalid. When a new connection arrives and the destination of its
      template is not available, the template is invalidated, a new
      template is created with a new destination, and the new
      connection is served.

    * Fixed the wrong debugging information in ip_vs_forward


Virtual Server patch for Linux 2.2 - Version 0.9.5 - November 28, 1999

Changes:
    * Fixed the undefined variable bug in the IP_VS_DBG

      Due to my carelessness, an undefined variable was left in the
      IP_VS_DBG statement of the ip_vs_dr_xmit function. Thank Roberto
      Nibali for reporting.
    * Changed ICMP_PROT_UNREACH to ICMP_PORT_UNREACH in ip_vs_leave

      When a virtual service is available but no destination is
      available, an ICMP_PORT_UNREACH ICMP packet is sent to notify the
      client that the service is not available. Since IPVS is in the IP
      layer, the TCP socket has been created, the TCP RST packet cannot
      be sent for TCP services; instead, ICMP_PORT_UNREACH is sent, no
      matter whether the service talks TCP or UDP. Thank Julian.

    * Added port zero support for persistent services

      Some applications consist of more than one service. Once a client
      is assigned to a real server for the first service, requests for
      the other services from the same client must be sent to the same
      server. Port zero is added for this kind of persistent service.

    * Fixed the bug that virtual ftp service blocks other services

      When a virtual ftp service is present and packets destined for
      other services not listed in the ipvs table arrive, wrong masq
      entries are created and those services are blocked.

    * Fixed the (null) print for unknown services in ipvsadm

      Thank Julian for reporting.


Virtual Server patch for Linux 2.2 - Version 0.9.4 - November 10, 1999

Changes:
    * Julian fixed the fatal return bug of ip_vs_leave()

      Since some code of the last ipvs version was changed, ip_vs_leave
      should return -2 instead of -3 if no virtual service is found.

    * Added the IPSKB_REDIRECTED flag

      The skb is set with the IPSKB_REDIRECTED and IPSKB_MASQUERADED
      flags, so that the system can detect an infinite loop of
      TUNNELED/DROUTED packets in ip_local_deliver caused by
      misconfiguration. For example, a user might configure the
      following:

          ipvsadm -a -t VIP:http -r -i ifconfig up

      Then packets for VIP:http are tunneled to its own interface,
      which causes an infinite loop.

    * Fixed the bug that freed skb may be used to masq_set_state

      In the original ip_fw_demasquerade function, masq_set_state was
      called after ip_vs_forward, and ip_vs_forward may free the skb,
      so masq_set_state may operate on the already freed skb.
      The current solution is simply to do masq_set_state before
      ip_vs_forward. No matter whether the packet is forwarded
      successfully or not, the masq state will be updated. Although it
      breaks the original semantics, it won't lead to serious errors.
      We look forward to fixing it under Rusty's netfilter framework,
      both for correctness and modularization. :-)

    Many thanks must go to Julian for his very cute comments on the
    ipvs 0.9.3 code. He also raised a question: could we simply use
    ip_route_output to skip IPv4 forwarding and firewall to
    tunnel/droute packets for a little bit of performance, or should we
    go back to ip_route_input for correctness? I am still thinking
    about it.


Virtual Server patch for Linux 2.2 - Version 0.9.3 - November 7, 1999

Changes:
    * Adapted the patch for kernel 2.2.13

      Since ntohl and the like were changed to unsigned int (because
      unsigned long int is 64-bit these days), some code in the VS
      patch was modified for this change, so that compiler warnings and
      unnecessary casting can be avoided.

    * Changed the masq timeout type and the maximum persistent timeout

      The type of the masq timeout was changed from 'unsigned' to
      'unsigned long', in order to keep it the same as the type of
      timer_struct expires, so the masq timeout will be 64-bit on
      64-bit platforms. The maximum persistent timeout was changed from
      one year to one month, because this is enough. Thank Julian for
      the suggestions.

    * Added ICMP handling for IPVS

      Incoming ICMP packets for virtual services are forwarded to the
      right real servers, and outgoing ICMP packets from virtual
      services are altered and sent out correctly. This is important
      for error and control notification between clients and servers,
      such as MTU discovery.

      Sorry for adding this so late; I used to stupidly think that it
      was not easy to add ICMP handling for IPVS. After spending a
      couple of hours reading the textbooks and the masq code, I found
      that it was quite easy to add. Sorry!
    * Changed the tunnel/dr/local forwarding without doing masq_skb_cow

      The order of some operations in the ip_fw_demasquerade and
      ip_fw_demasq_icmp functions was changed, so that the masq skbuff
      copy-on-write can be avoided in the tunnel/dr/local forwarding
      methods. This improves performance for the tunnel/dr/local
      forwarding methods.

    * Use vmalloc to allocate big hash table.

      A big IPVS hash table of 256K entries or more can be allocated
      now.


Virtual Server patch for Linux 2.2 - Version 0.9.2 - October 17, 1999

Changes:
    * Added support for netmasks with persistence

      The client source address is masked with this netmask for the
      purpose of accessing the templates. Added a new port to the
      service structure and changed ipvsadm to support this. Defaults
      to 255.255.255.255, which emulates the old behaviour.
      (Lars Marowsky-Bree)

    * Fixed the bug that server status checking doesn't work for
      LVS/NAT, and changed some cosmetic things for debugging. Thank
      Julian for the fix.


Virtual Server patch for Linux 2.2 - Version 0.9.1 - October 6, 1999

Changes:
    * Fixed the counting bug in ip_vs_unbind_masq again

      Don't touch counters for templates.

    * Removed extra read_unlock in __ip_vs_lookup_service

    * Changed not to restart template timers if dest is unavailable

      If the client actively sends packets when the destination is
      unavailable, the masq template can expire.

    * Added the destination trash

      The destination trash is used to hold the destinations that are
      removed from the service table but are still referenced by some
      masq entries. The reason for adding the destination trash is
      that, when a dest is temporarily down (taken down either by the
      administrator or by a monitor program), the dest can be picked
      back from the trash, the remaining connections to the dest can
      continue, and the counting information of the dest is also useful
      for scheduling.

    * Added the ip_vs_leave function

      It is called by ip_fw_demasquerade when the matched service is
      available but no destination is available for a new connection,
      to drop the packet.
      This should be a good behavior.

    * Changed from drastically removing the masq to silently dropping
      packets and keeping the masq until it expires, when its
      destination is not available.

      It is a good behavior when the destination is temporarily down.

    The above fixes and changes would not have been possible without
    Julian Anastasov's fixes and suggestions. Thank Julian!

    * Added the handling of weight=0 in every scheduler

      A destination with weight=0 is "quiesced" and will not receive
      any new connections, but will still serve the existing
      connections. This feature is useful to cool down overloaded
      servers or to take some servers out of service for maintenance.

    * Added the update_service function in every scheduler

      When the destination list of a service is modified, the
      update_service function is called to reset the scheduling
      pointer, so that the scheduling pointer won't point to a freed
      destination.

    * Changed some IP_VS_ERR to IP_VS_DBG in the ip_vs_tunnel_xmit

    * Added different timeout support for persistent service

      Users can specify different timeout values for their different
      persistent services.

    * Fixed the bug that persistent service cannot be edited

    * Changed the output of ip_vs_procinfo for the new version of
      ipvsadm.


Virtual Server patch for Linux 2.2 - Version 0.9.0 - September 24, 1999

Changes:
    * Added the hash table for virtual services

      It will greatly speed up the lookup of services.

    * Added new persistent service handling

      The template is looked up only if the service that the packet is
      destined to is persistent, so it is more efficient.

      For all the persistent services except FTP, we create a masq
      template like . So, the persistent services won't disturb each
      other, and it fixes the wrong accounting bug for different
      persistent services.

      FTP is a very complicated network protocol, and it uses a control
      connection and data connections. For active FTP, the FTP server
      initiates the data connection to the client, and its source port
      is often 20.
      For passive FTP, the FTP server tells the client the port that it
      passively listens on, and the client issues the data connection.
      In the tunneling or direct routing mode, the load balancer is on
      the client-to-server half of the connection, so the port number
      is unknown to the load balancer. So, a template masq like is
      created for persistent FTP service.

    * Changed the destination lists to the d-linked lists

    * Changed the scheduler list to the d-linked list

    * Added back the least connection scheduling module.

----------------------------------------------------------------------

Virtual Server patch for Linux 2.2 - Version 0.8.3 - September 8, 1999

Changes:
    * Fixed the missing unlock bug in ip_vs_schedule.

      If no virtual service is found in ip_vs_schedule, this missing
      unlock bug will make the system crash.

    * Fixed the uncounting bug in creating masqs by template.

      Connections were not counted when creating masqs by template.

    * Don't touch counters in ip_vs_unbind_masq for templates

    Thanks must go to Julian Anastasov for the three fixes above.

    * Changed the order of some conditions for a bit of performance

    * Changed some cosmetic things for debugging


Virtual Server patch for Linux 2.2 - Version 0.8.2 - September 5, 1999

Changes:
    * Fixed the IP_MASQ_F_VS_INACTIVE cleared bug after editing dest.

      Thank Julian Anastasov for the fix.

    * Added the separate inactive connection counter for each dest

      The WLC scheduler can use this counter directly for scheduling.
      And, the masq template won't be counted in inactive connections.
      Thank Julian Anastasov for the suggestion.

    * Changed all the scheduler modules to return the server dest
      directly, and ip_vs_schedule creates the new masq entry itself.


Virtual Server patch for Linux 2.2 - Version 0.8.1 - September 2, 1999

Changes:
    * Uncommented a few statements to make virtual FTP via NAT really
      work.

      Virtual FTP service via NAT really works well, no matter whether
      it is in active or passive mode. But, remember to
      "insmod ip_masq_ftp" before using FTP service through VS-NAT.
    * Removed some commented-out blocks.

      The code looks nice. :)


Virtual Server patch for Linux 2.2 - Version 0.8 - September 1, 1999

Changes:
    * Added the persistent port feature.

      Users can specify whether the virtual service port is persistent
      or not. It is more flexible. The original PCC scheduling is
      removed.

    * Added the dest server status checking.

      The server status is checked before forwarding a packet. If the
      server is not available (down or put out of service), the packet
      will be dropped and the client will be notified immediately.
      The server status is also checked while generating a masq entry
      based on the masq template. If the server is not available, the
      new entry won't be created.

    * Added some code in ip_masq_ftp.c to handle virtual FTP service
      for VS-NAT.

      The passive handling code in ip_masq_ftp.c never works.

    * Fixed stepping to mSR after SYN in INPUT_ONLY table.

      Thank Julian Anastasov for doing it. It makes it much, much
      harder to synflood a LinuxDirector until it runs out of memory.

    * Fixed the huge masq expire bug after a bad checksum.

      Thank Julian Anastasov for fixing it.

    * Added the IP_MASQ_F_VS_INACTIVE flag and fixed the connection
      counter

      Thank Julian Anastasov for the suggestion and fix example.

    * Fixed the incorrect lookup in hash table.

      The ms=NULL statement was forgotten if no entry is found, which
      makes the lookup incorrect and may lead to huge masq expire.
      Stupid mistake, but the result is serious.

    * Fixed the incorrect slow timer vector layout

      The layout is now correct and uses memory more efficiently.

    * Fixed the bug of slow timer being added twice for masq template

----------------------------------------------------------------------

Virtual Server patch for Linux 2.2 - Version 0.7 - July 9, 1999

Changes:
    * Added a separate masq hash table for IPVS.

    * Added slow timers to expire masq entries.

      Slow timers are checked every second by default. Most of the
      overhead of cascading timers is avoided.
      With this new hash table and slow timers, the system can hold a
      huge number of masq entries, but make sure that you have enough
      free memory. One masq entry effectively costs 128 bytes of memory
      (thank Alan Cox). If your box holds 1 million masq entries (which
      means that your box can receive 2000 connections per second if
      the masq expire time is 500 seconds on average), make sure that
      you have 128M of free memory. And, thank Alan for suggesting the
      early random drop algorithm for masq entries that prevents the
      system from running out of memory. I will design and implement
      this feature in the near future.

    * Fixed the unlocking bug in the ip_vs_del_dest().

      Thank Ted Pavlic for reporting it.

----------------------------------------------------------------------

Virtual Server patch for Linux 2.2 - Version 0.6 - July 1, 1999

Changes:
    * Fixed the overflow bug in the ip_vs_procinfo().

      Thank Ted Pavlic for reporting it.

    * Added the functionality to change the weight and forwarding
      (dispatching) method of an existing real server.

      This is useful for load-informed scheduling.

    * Added the functionality to change the scheduler of a virtual
      service on the fly.

    * Reorganized some code and changed the names of some functions.

      This makes the code more readable.

----------------------------------------------------------------------

Virtual Server patch for Linux 2.2 - Version 0.5 - June 22, 1999

Changes:
    * Fixed the bug that LocalNode doesn't work in vs-0.4-2.2.9.

      Thank Changwon Kim for reporting the bug and pointing out to me
      the checksum update problem in the code.

    * Some code of VS in ip_fw_demasquerade was reorganized so that
      the packets for VS-Tunneling, VS-DRouting and LocalNode skip the
      checksum update.

      This makes the code right and efficient.

----------------------------------------------------------------------

Virtual Server patch for Linux 2.2 - Version 0.4 - June 1, 1999

    Most of the code was rewritten.
    The locking and refcnt were changed. The violation of the "no
    floats in kernel mode" rule in the weighted least-connection
    scheduling was fixed. This patch is more efficient, and should be
    more stable.

----------------------------------------------------------------------

Virtual Server patch for Linux 2.2 - Version 0.1~0.3 - May 1999

    Peter Kese ported the VS patch to kernel 2.2, rewrote the code and
    loadable scheduling modules.

==========================================================================

ChangeLog of Virtual Server patch for Linux 2.0

----------------------------------------------------------------------

Virtual Server Patch for Linux - Version 0.9 - May 1, 1999

Differences with virtual server patch version 0.8:

    * Add Virtual Server via Direct Routing

      This approach was first implemented in IBM's NetDispatcher. All
      real servers have their loopback alias interface configured with
      the virtual IP address, and the load balancer and the real
      servers must have one of their interfaces physically linked by a
      HUB/Switch. When packets destined for the virtual IP address
      arrive, the load balancer routes them directly to the real
      servers, and the real servers process the requests and return
      the reply packets directly to the clients.

      Compared to the virtual server via IP tunneling approach, this
      approach doesn't have tunneling overhead (in fact, this overhead
      is minimal in most situations), but requires that one of the
      load balancer's interfaces and the real servers' interfaces be in
      the same physical segment.

    * Add more statistics information

      The active connection counter and the total connection counter
      of each real server were added for all the scheduling algorithms.

    * Add resetting (zeroing) counters

      The total connection counters of all real servers can be reset
      to zero.
    * Change some statements in the masq_expire function and the
      ip_fw_demasquerade function, so that ip_masq_free_ports won't
      become an abnormal number after the masquerading entries for
      virtual server are released.

    * Fix the bug of "double unlock on device queue"

      Remove the unnecessary function call of skb_device_unlock(skb)
      in the ip_pfvs_encapsule function, which sometimes causes the
      "kernel: double unlock on device queue" warning in the virtual
      server via tunneling.

    * Many functions of the virtual server patch were split out into
      linux/net/ipv4/ip_masq_pfvs.c.

    * Upgrade ippfvsadm 1.0.2 to ippfvsadm 1.0.3

      Zeroing counters is supported in the new version. The ippfvsadm
      1.0.3 can be used for all kernels with different virtual server
      options without rebuilding the program.

--------------------------------------------------------------------

Virtual Server Patch for Linux - Version 0.8 - March 6, 1999

Differences with virtual server patch version 0.7:

    * Add virtual FTP server support

      The original ippfvs via IP tunneling could not be used to build
      a virtual FTP server, because the real servers could not
      establish data connections to clients. Code was added to parse
      the port number in the ftp control data and create the
      corresponding masquerading entry for the coming data connection.

      Although the original ippfvs via NAT could be used to build a
      virtual server, the data connection was established in this way:

          Real Server port:20 ----> ippfvs: allocate a free masq port
                              -----> the client port

      It is not elegant and is time-consuming. Now it was changed as
      follows:

          Real Server port:20 ----> ippfvs port: 20 ----> the client port

    * Change the port checking order in the ip_fw_demasquerade()

      If the size of the masquerade hash table is well chosen,
      checking a masquerading entry in the hash table will require
      just one hit. It is much more efficient than checking the port
      for virtual services, and there are at least 3 incoming packets
      for each connection, which require port checking.
      So, it is efficient to check the masquerading hash table first
      and then check the port for virtual services.

    * Remove a useless statement in the ip_masq_new_pfvs()

      The useless statement in the ip_masq_new_pfvs function is

          ip_masq_free_ports[masq_proto_num(proto)]++;

      which may disturb the system.

    * Change the header printing of the ip_pfvs_procinfo()

--------------------------------------------------------------------

Virtual Server Patch for Linux - Version 0.7 - February 10, 1999

Differences with virtual server patch version 0.6:

    * Fix a bug in detecting the finish of a connection for tunneling
      or NATing to the local node.

      Since the server replies to the client directly in tunneling or
      NATing to the local node, the load balancer (LinuxDirector) can
      only detect one FIN segment. It was a mistake to remove the masq
      entry only if FIN segments from both sides are detected, because
      the masq entry then only expires after 15 minutes. For the
      situation above, the code was changed to set the masq entry to
      expire in TCP_FIN_TIMEOUT (2min) when an incoming FIN segment is
      detected.

    * Add the patch version printing in the ip_pfvs_procinfo()

      It makes it easy for users and hackers to know which virtual
      server patch version they are running. Thank Peter Kese for the
      suggestion.

--------------------------------------------------------------------

Virtual Server Patch for Linux - Version 0.6 - February 2, 1999

Differences with virtual server patch version 0.5:

    * Add the local node feature in virtual server.

      If the local node feature is enabled, the load balancer can not
      only redirect the packets of the specified port to the other
      servers (remote nodes) for processing, but can also process the
      packets locally (local node). Which node is chosen depends on
      the scheduling algorithm.

      This local node feature can be used to build a virtual server of
      a few nodes, for example, 2, 3 or more sites, in which it is a
      waste of resources if the load balancer is only used to redirect
      packets.
      It is wise to direct some packets to the local node for
      processing. This feature can also be used to build distributed
      identical servers, in which a server that is too busy to handle
      requests locally can seamlessly forward requests to other servers
      for processing.

      This feature can be applied to both virtual server via NAT and
      virtual server via IP tunneling.

      Thank Peter Kese for the idea of "Two node Virtual Server" and
      his single line patch for virtual server via IP tunneling.

    * Remove a useless function call ip_send_check in the virtual
      server via IP tunneling code.

--------------------------------------------------------------------

Virtual Server Patch for Linux - Version 0.5 - November 25, 1998

Differences with virtual server patch version 0.4:

    * Add the feature of virtual server via IP tunneling.

      If the ippfvs is enabled using IP tunneling, the load balancer
      chooses a real server from a cluster based on a scheduling
      algorithm, encapsulates the packet and forwards it to the chosen
      server. All real servers are configured with "ifconfig tunl0 up".
      When the chosen server receives the encapsulated packet, it
      decapsulates the packet, processes the request and returns the
      reply packets directly to the client without passing through the
      load balancer. This can greatly increase the scalability of the
      virtual server.

    * Fix a bug in the ip_portfw_del() for the weighted RR scheduling.

      The bug in version 0.4 is that, when the weighted round-robin
      scheduling is used, deleting the last rule for a virtual server
      reports a "setsockopt failed: Invalid argument" warning. In fact
      the last rule is deleted, but gen_scheduling_seq() works on a
      null list and causes that warning.

    * Add and modify some descriptions for virtual server options in
      the Linux kernel configuration help texts.

--------------------------------------------------------------------

Virtual Server Patch for Linux - Version 0.4 - November 12, 1998

Differences with virtual server patch version 0.3:

    * Fix a memory access error bug.
      The set_serverpointer_null() function is added to scan all the
      existing ip masquerading records for server pointers that point
      to the specified server and set them to null. It is useful when
      administrators delete a real server or all real servers: the
      pointers pointing to the server must be set to null. Otherwise,
      decreasing the connection counter of the server may cause a
      memory access error when the connection terminates or times out.

--------------------------------------------------------------------

Virtual Server Patch for Linux - Version 0.3 - November 10, 1998

Differences with virtual server patch version 0.2:

    * Change the simple round-robin scheduling to the weighted
      round-robin scheduling.

      Simple round-robin is a special instance of the weighted
      round-robin scheduling when the weights of the servers are the
      same.

    * The scheduling algorithm originally called the weighted
      round-robin scheduling in version 0.2 actually is the weighted
      least-connection scheduling. So the concept is clarified here.

    * Add the least-connection scheduling algorithm.

      Although it is a special instance of the weighted
      least-connection scheduling algorithm, it is used to avoid
      dividing by the weight when looking up servers whose weights are
      all the same, so the overhead of scheduling can be minimized in
      this case.

    * Change the type of the server load variables, curr_load and
      least_load, from integer to float in the weighted
      least-connection scheduling.

      It can give better load balancing when the weights specified
      are high.

    * Merge the original two patches into one.

      Users have to specify which scheduling algorithm is used, the
      weighted round-robin scheduling, the least-connection
      scheduling, or the weighted least-connection scheduling, before
      rebuilding the kernel.

    * Change the ip_pfvs_proc function to make the output of the port
      forwarding & virtual server table more beautiful.
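
The two "special instance" remarks in the version 0.3 notes can be
illustrated with a short sketch. It is written in Python for readability
and is not the patch's kernel code: the Server fields, the function
names, and the rule of skipping servers whose weight is not positive are
all assumptions made for this illustration. The weighted
least-connection comparison here uses cross-multiplication so that no
division is needed, which is only one possible approach; version 0.3
itself is described as using float load variables.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Server:
    # Hypothetical record; the field names are not taken from the patch.
    name: str
    weight: int      # relative capacity of the real server
    conns: int = 0   # current number of connections


def wrr_sequence(servers: Sequence[Server]) -> List[str]:
    """One round of a naive weighted round-robin: each server appears
    `weight` times. With equal weights of 1 this is plain round-robin."""
    return [s.name for s in servers for _ in range(s.weight)]


def least_connection(servers: Sequence[Server]) -> Optional[Server]:
    """Pick the server with the fewest connections; weights are ignored,
    so no division by the weight is involved."""
    return min(servers, key=lambda s: s.conns, default=None)


def weighted_least_connection(servers: Sequence[Server]) -> Optional[Server]:
    """Pick the server with the smallest conns/weight ratio.

    The test  s.conns / s.weight < best.conns / best.weight  is rewritten
    as  s.conns * best.weight < best.conns * s.weight  to stay in
    integer arithmetic.
    """
    best: Optional[Server] = None
    for s in servers:
        if s.weight <= 0:
            continue  # assumption: servers without capacity are skipped
        if best is None or s.conns * best.weight < best.conns * s.weight:
            best = s
    return best
```

With servers a (weight 1, 2 connections), b (weight 3, 5 connections)
and c (weight 2, 3 connections), least_connection picks a, while
weighted_least_connection picks c, whose connections-to-weight ratio of
1.5 is the lowest. When all weights are equal, both functions pick the
same server.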
--------------------------------------------------------------------

Virtual Server Patch for Linux - Version 0.2 - May 28, 1998

Differences with virtual server patch version 0.1:

    * Add the weighted round-robin scheduling patch.

--------------------------------------------------------------------

Virtual Server Patch for Linux - Version 0.1 - May 26, 1998

    * Implement the infrastructure of virtual server.

    * Implement the simple round-robin scheduling algorithm.

--------------------------------------------------------------------