Proxy DHCP ipxe_efi Mode Timeout Error



  • Hi all, I work for a company tasked with coming up with an imaging solution after our (very old) CloneZilla setup bit the dust. After some research I settled on CloneDeploy as our solution. I am not a server/networking expert by any means, but by following the various guides and forum posts out there I've largely been able to get this set up exactly how we need it. We have newer and older systems with Legacy and UEFI BIOS support, so the only realistic way to use CD is with Proxy DHCP mode.

    I've been able to get 95% of this working the way we need it to. Individually, pxelinux works for our older, legacy systems, and Efi64 works for our newer UEFI systems. The issue comes up in Proxy DHCP mode: when I set Efi64 to ipxe_efi (which works when not using Proxy mode), I consistently get timeouts after the NBP file downloads successfully and iPXE initializes. It's entirely possible that this is down to how I have tftpd configured. I was never able to find a straightforward guide on exactly how to configure it for Proxy DHCP mode, but I pieced together a number of different guides/forum posts to get to where I am now.

    I have found a number of articles and solutions pointing mostly to networking or setup issues, but nothing quite lines up with my exact experience: I can turn off Proxy DHCP, switch back to only ipxe_efi64, and the newer system boots up and works fine. Booting the Legacy system in proxy mode works fine as well; it's just the newer UEFI-based system I'm having trouble with in proxy mode.

    For some additional background, I am running the server on a Windows 10 machine using the included Tftpd64 client and version 2.1.2 of the proxyd service. This is currently a test system I'm mostly using as a proof of concept before moving to another permanent PC (likely still running Win10), so it's possible that redoing this from scratch would fix whatever is causing this, but I'm stumped at this point. If there's any additional troubleshooting I can do or information I can provide to help pin down the problem with my setup, I'd greatly appreciate it.



  • When you run the proxy in debug mode, do you see the requests coming in? Also, when you enable the proxy, are you disabling options 66 and 67 on your DHCP server?



  • I do see the requests coming in, yes. The MAC address shows up in the list and it matches the system, so that seems to be working alright. I'm glad you mentioned the option 66/67 thing, because this was the one piece of info I kept coming across when researching the issues I've been having, but I truly have no idea where option 66/67 is located or how to change it. I presume this would be an option in the Tftpd client, but I'm not sure where or how I'd change it.



  • Since you are using the included DHCP server, just make sure the bootfile is empty; that's basically the same thing as option 67.



  • Gotcha, in that case yes I did make sure the bootfile is empty.



  • I just re-attempted this and got a clipping from the dhcp client that might provide some insight:

    Rcvd DHCP Discover Msg for IP 0.0.0.0, Mac A8:45:E9:10:48:E8 [17/04 11:19:11.902]
    DHCP: proposed address 192.168.0.21 [17/04 11:19:11.904]
    Rcvd DHCP Rqst Msg for IP 0.0.0.0, Mac A8:45:E9:10:48:E8 [17/04 11:19:15.199]
    Client requested address 192.168.0.122 which was not allocated by tftpd32 and is either outside our pool or is used by someone else [17/04 11:19:15.200]
    Connection received from 192.168.0.122 on port 2034 [17/04 11:19:16.199]
    Read request for file <proxy/efi64/pxeboot.0>. Mode octet [17/04 11:19:16.199]
    OACK: <tsize=882048,blksize=1468,> [17/04 11:19:16.199]
    Using local port 61225 [17/04 11:19:16.199]
    Peer returns ERROR <User aborted the transfer> -> aborting transfer [17/04 11:19:16.201]
    Connection received from 192.168.0.122 on port 2035 [17/04 11:19:16.275]
    Read request for file <proxy/efi64/pxeboot.0>. Mode octet [17/04 11:19:16.275]
    OACK: <blksize=1468,> [17/04 11:19:16.275]
    Using local port 61226 [17/04 11:19:16.275]
    <proxy/efi64/pxeboot.0>: sent 601 blks, 882048 bytes in 1 s. 0 blk resent [17/04 11:19:17.200]
    Rcvd DHCP Discover Msg for IP 0.0.0.0, Mac A8:45:E9:10:48:E8 [17/04 11:19:17.698]
    DHCP: proposed address 192.168.0.21 [17/04 11:19:17.699]
    Rcvd DHCP Rqst Msg for IP 0.0.0.0, Mac A8:45:E9:10:48:E8 [17/04 11:19:17.719]
    Client requested address 192.168.0.122 which was not allocated by tftpd32 and is either outside our pool or is used by someone else [17/04 11:19:17.720]
    
    

    My pool starts at 192.168.0.20, why would this be trying to request an IP outside of the pool?



  • I'm becoming increasingly concerned that this is due to the unique setup of this new system of ours. The NIC is in an I/O hub on its own that connects to the PC via a USB-C cable, so the mobo is separate from the NIC. I did try booting through a USB Ethernet adapter plugged directly into the PC where the motherboard is located, just for the hell of it, and had similar issues though. It seems like it keeps trying to use an IP address that is just outside the range I have configured. So with my starting pool IP ending in .20 and a pool size of 200, the system tries applying an IP that ends in .221, and that's where it fails. The .122 one I posted about last was from when I had the pool size set to 100, so it's doing the same thing there.
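
    For anyone checking the same arithmetic, here is a minimal Python sketch of the pool bounds. It assumes the pool is one contiguous block of `size` addresses beginning at the starting address (my reading of the Tftpd64 pool settings, not something confirmed in this thread); `pool_range` and `in_pool` are hypothetical helper names.

```python
import ipaddress


def pool_range(start, size):
    """Return (first, last) address of a contiguous pool.

    Assumption: the pool holds `size` consecutive addresses,
    beginning at `start`.
    """
    first = ipaddress.IPv4Address(start)
    last = first + (size - 1)
    return first, last


def in_pool(addr, start, size):
    """True if `addr` lies inside the pool described by start/size."""
    first, last = pool_range(start, size)
    return first <= ipaddress.IPv4Address(addr) <= last


if __name__ == "__main__":
    # Values taken from the posts above.
    print(pool_range("192.168.0.20", 100))
    print(in_pool("192.168.0.122", "192.168.0.20", 100))
    print(in_pool("192.168.0.221", "192.168.0.20", 200))
```

    Under that assumption, a pool of 100 starting at .20 ends at .119 and a pool of 200 ends at .219, so both .122 and .221 fall outside it, while the .21 that tftpd proposed is inside.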



  • Welp, I feel stupid. The router I was using to test this on had DHCP mode enabled. I realized this when I booted into Windows on the machine that wouldn't properly start up and it was getting that same .122 IP address. At that point I figured it must be router related, and sure enough DHCP mode was on. Turned that off and voila, we're in business. 😎
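
    In hindsight, the Tftpd log earlier in the thread already pointed at a second DHCP server: tftpd proposed 192.168.0.21, but the client came back requesting 192.168.0.122, an address tftpd never offered. Here is a rough Python sketch for spotting that pattern in a saved log. The regexes are written against the exact wording of the log lines quoted above and may not match other Tftpd versions; `find_foreign_requests` is a hypothetical helper name.

```python
import re

# Patterns based on the log wording quoted in this thread (an assumption
# for any other Tftpd version).
PROPOSED = re.compile(r"DHCP: proposed address (\d+\.\d+\.\d+\.\d+)")
REQUESTED = re.compile(
    r"Client requested address (\d+\.\d+\.\d+\.\d+) which was not allocated"
)


def find_foreign_requests(log_text):
    """Return (proposed, requested) pairs where the client asked for an
    address this server did not hand out.

    A mismatch suggests another DHCP server on the network answered too.
    """
    last_proposed = None
    mismatches = []
    for line in log_text.splitlines():
        match = PROPOSED.search(line)
        if match:
            last_proposed = match.group(1)
            continue
        match = REQUESTED.search(line)
        if match:
            mismatches.append((last_proposed, match.group(1)))
    return mismatches


# Four lines copied from the log posted above.
SAMPLE_LOG = """\
Rcvd DHCP Discover Msg for IP 0.0.0.0, Mac A8:45:E9:10:48:E8 [17/04 11:19:11.902]
DHCP: proposed address 192.168.0.21 [17/04 11:19:11.904]
Rcvd DHCP Rqst Msg for IP 0.0.0.0, Mac A8:45:E9:10:48:E8 [17/04 11:19:15.199]
Client requested address 192.168.0.122 which was not allocated by tftpd32 and is either outside our pool or is used by someone else [17/04 11:19:15.200]
"""

if __name__ == "__main__":
    for proposed, requested in find_foreign_requests(SAMPLE_LOG):
        print(f"offered {proposed}, client requested {requested}")
```

    If this reports any pairs, it's worth checking routers and other boxes on the segment for an active DHCP service before digging into the proxy or TFTP config.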