Sunday 25 February 2024

Expansion board revD and new connector PCBs

Spoiler: This is the new revE expansion board, and the matching bridge board that connects it to the MEGA65 motherboard.  Read below for the full story and adventure as usual, and to find out what that big hole in the bridging board is all about...

The new revD MEGA65 expansion board PCBs arrived today, as well as the prototype connector PCBs to replace the need for cables to bridge between the expansion board and the MEGA65 motherboard. Yay! So now it's time to see how well I did with the various things I was trying to fix / implement.

First, the hole positions on the revD expansion board are essentially in the right places, although I realised that I still missed one whole hole in the motherboard. Fortunately it's not in a position that will cause physical interference, so I'm going to ignore it for now. We'll call that a victory. 

Second, the thin connector PCB that joins the PMODs of the MEGA65 to those of the expansion board is indeed thin enough that it doesn't cause physical interference with the MEGA65 keyboard, which was a bit of a worry.  So that's another good victory. You can see from the shadow in this photo that there is clearance between the two, and that's without the connector PCB being fully inserted, which will increase the free space. This was a pain to photograph, because the inside of the case is in the dark, of course. So I had to use the flash, which resulted in annoying external shadows. But as said, the connector PCB, the thing with the two sets of 12 pins poking through, is well and truly clear of the keyboard PCB, i.e., the big green PCB covering most of the trapdoor.

We can also get a general idea of how the various boards will fit together in the picture below:

The expansion board sits in the bottom of the case, as you'd expect, and then the thin connector board joins the PMODs on the front edge of the two boards, and then the other bridge board connects the JTAG, floppy connectors and power connectors between the MEGA65 main board and the expansion board.  As you can probably already see with that PMOD connector board, the relative positions of the connectors are not quite right...

In fact, almost all the connectors on both boards are not correctly located. I did try during the initial design of these boards to get all the measurements right, but I failed fairly dismally in the end.  I'm a bit annoyed with the result, partly because I can see several problems that I really should have been able to pick up during the design phase. Others were always likely to be a bit tricky, but I had tried to get them right by printing pieces of paper with the PCB outlines and connector positions so that I could validate them before manufacture.  It's probably not surprising that errors of up to about 1mm remain after the paper method, as it was quite hard to line things up when the MEGA65 main board is fully populated, and I didn't have a printable version of the bare board to do the paper trick for that.

All in all it's ok, because much of the point of making these prototypes was to identify exactly this kind of issue, and the chances were slim that I'd get them _all_ right first time. However, getting at least some of them right first time would have been nice... What I am going to do, though, is make the changes to the bridge and connector PCBs, rather than the expansion board PCB, in the hope that I don't have to change the expansion board PCB, partly because it's more work to do so, and it would be nice to not waste those boards that I have had made.

So I soldered some connectors on to some of the boards so that I could work out the errors in the positions, especially between the MEGA65 motherboard and the expansion board.  I'm noting the required changes down as I go:

P2 - move right approximately 0.5 mm

J9 - move left approximately 0.75 mm

J10 - move left approximately 3.0 mm

JB1 - move up approximately 0.5 mm

JB1 - move right approximately 0.5 mm

J1 - move up approximately 0.25 mm

J1 - move right approximately 0.5 mm

J4 - move left approximately 1.0 mm

J4 - move down approximately 2.0 mm

J2 - move to have the correct relative position to J4.

JTAG RELAY - move to have the correct relative position to J4
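
To keep the bookkeeping straight, the corrections above can be folded into one net offset per connector. This is only a sketch: the sign convention (right and up are positive) and the `CORRECTIONS` table layout are assumptions made for illustration, and J2 and the JTAG relay are left out because they are specified relative to J4 rather than as distances.

```python
from collections import defaultdict

# (connector, direction, millimetres), transcribed from the list above.
CORRECTIONS = [
    ("P2", "right", 0.5),
    ("J9", "left", 0.75),
    ("J10", "left", 3.0),
    ("JB1", "up", 0.5),
    ("JB1", "right", 0.5),
    ("J1", "up", 0.25),
    ("J1", "right", 0.5),
    ("J4", "left", 1.0),
    ("J4", "down", 2.0),
]

# Assumed sign convention: +x is right, +y is up.
UNIT = {"right": (1, 0), "left": (-1, 0), "up": (0, 1), "down": (0, -1)}


def net_offsets(corrections):
    """Sum the per-axis moves into one (dx, dy) offset in mm per connector."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for name, direction, mm in corrections:
        ux, uy = UNIT[direction]
        totals[name][0] += ux * mm
        totals[name][1] += uy * mm
    return {name: (dx, dy) for name, (dx, dy) in totals.items()}


OFFSETS = net_offsets(CORRECTIONS)
for name, (dx, dy) in OFFSETS.items():
    print(f"{name}: dx={dx:+.2f} mm, dy={dy:+.2f} mm")
```

Having a single (dx, dy) per connector makes it harder to apply one half of a two-axis correction and forget the other.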

All of those "approximately" measures are those that relate to the relative position of the MEGA65 PCB and the expansion board PCB, and/or the unknown exact relative positions of the connectors on the MEGA65 motherboard.  Some I can deduce from the drawings that I have managed to find (and in retrospect should have found and used before sending the PCBs off to manufacture...) such as:

J1 should have the holes centred horizontally compared to J3, based on drawings of the MEGA65 main board.   But vertical distance relative to J3 cannot be determined.

Well, that's actually the only one I can be certain of. 

P1 and P2 should almost certainly be 0.9 inches apart, as they are based on 2.54mm = 0.1" spacing.

J9 and J10 should similarly be aligned to the 0.1" grid, although they seem to be 0.8" apart, rather than 0.9" apart, which I can at least verify from the expansion board PCB. But I'm still relying on hand measurements to work out their horizontal displacement relative to P1.
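
Because these headers sit on a 0.1" pitch, a hand measurement can be sanity-checked by snapping it to the nearest whole number of pitch steps and looking at the residual. The sketch below assumes that approach; the function names and the example measurement of 22.5mm are made up for illustration.

```python
PITCH_MM = 2.54  # 0.1 inch header pitch


def pitch_steps_to_mm(steps):
    """Distance in mm covered by a whole number of 0.1 inch steps."""
    return steps * PITCH_MM


def nearest_pitch_steps(distance_mm):
    """Return (steps, residual_mm) for the closest whole number of steps."""
    steps = round(distance_mm / PITCH_MM)
    return steps, distance_mm - steps * PITCH_MM


# 0.9" and 0.8" spacings expressed in mm.
print(f'0.9" = {pitch_steps_to_mm(9):.2f} mm, 0.8" = {pitch_steps_to_mm(8):.2f} mm')

# Hypothetical hand measurement between two headers.
steps, residual = nearest_pitch_steps(22.5)
print(f"22.5 mm is closest to {steps} steps, residual {residual:+.2f} mm")
```

A residual of a few tenths of a millimetre points at measurement error rather than a connector that is genuinely off the grid.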

I just don't have the drawings to improve the rest. So I'm just going to have to trust my measurements of the errors, and try to get those accurate enough, that the connector placements all fall within acceptable tolerance.

I've now gone through all the changes, and I think I have everything correctly placed. But then, I did last time, too.  I'm going to sleep on it, and check it all again in the morning.  I might also see if I can't relate the positions between the MEGA65 main board drawing and the expansion board PCB using the mounting holes in the case, as they do provide an absolute reference.  I just need to make sure I measure relative to one of the holes that is well centred, rather than H3 and H9 that are a little offset to one side.

Well, it's another day, and I've converted the PMOD connector board to 2-layer instead of 4-layer, which reduces the build time to ~24 hours instead of 4 - 5 days. I've ordered the two boards as is, as I needed to put in an order to PCBway anyway, for some C64 cartridge break-out boards to debug an unrelated problem.  So I should have the new boards within a week, probably on the 29th or 30th Jan.

The conversion to 2-layer for the PMOD board required the board to be a little wider, but it should still comfortably fit in the case with a couple of millimetres of clearance between the edge of the board and the bottom case.

Anyway, they are ordered, and should be en-route to me here in Australia within 24 hours, and at my door early next week.  I'll continue this post when they arrive.

The boards have arrived and they very nearly fit nicely. I can force it all together, but then the screw holes don't line up exactly.  But it gives a nice idea of the final appearance. But I am going to have to re-spin at least the bridge board, and probably the PMOD connector board, too.


I do still like how nice and neat it is when it is all hooked up.  The spacing between each PMOD in a pair of PMODs on the PMOD connector board is fine.  But the spacing between the pairs of the PMODs is about 0.5mm too narrow. It can be forced, but it really is a force.

For the bridge board, it looks like the JTAG connector on the MEGA65 main board side needs to be shifted up about 1mm, and the other connectors on the MEGA65 side up by about 0.5mm. I would like to double check that with a bare MEGA65 board, but I don't have one.  So I've emailed Trenz Electronic to see if I can get exact measurements from their design files.  But the bridge board does fit easily onto the expansion board:

The third and final issue I have seen so far is that I wasn't able to get JTAG working via the bridge board -- the serial interface works on the TE0790, and while the JTAG interface gets detected, every time I have tried to send a bitstream via JTAG it has failed.  So I'll need to investigate that further.

The first thing to try is to use a cable to do the JTAG connection instead of the bridge board. That way we will know if the revD expansion board can do JTAG at all. If that works, then I'll suspect the bridge board. If it doesn't work, then the expansion board will need another re-spin.

Interestingly, with a cable instead of the board, I can't even get the JTAG adapter to be recognised on the revD expansion board. I'm suspecting that the more circuitous routing on the revD expansion board for this connector is causing problems.  But that shouldn't stop the JTAG adapter being detected, only prevent it from succeeding with JTAG operations. Squeezing the cables around I can get the JTAG device appearing, but the JTAG communications are still thoroughly messed up.

I think the solution here will be to put the relay connector much closer to the TE0790 connector. I might also try putting the header for the TEI0004 on the revD board I have assembled, and test that, as it is at the end of the traces, so should not have any reflection problems.  It might in fact be the presence of that connector that is pushing things over the edge.

Hmm, I found one problem with the cable: One of the pins has broken the little housing it is in, so it was only making intermittent connection. That's probably why the JTAG interface was only being intermittently detected. 

So, back to the JTAG not working to load a bitstream, if I have a TEI0004 fitted and connect the cable to where I would have the TE0790 normally, then I can push a bitstream over JTAG without problem. The serial monitor interface also works. This results in a short and direct path between the TEI0004 and the off-load from the expansion board, as they are basically next to each other:

 


So now let's try moving the cable to the header where it is supposed to go. In this configuration that path will still not have reflection problems, because it will still be linking end-to-end over the circuit path, but it will have the more convoluted routing in the path:

And that works fine, too, being able to push a bitstream to it.  So that's a convenient enough work around for my test board.  But I would still like to get to the bottom of this problem, as the reflections weren't a problem with the previous revision. My gut feeling is that the combined impact of the reflections from the traces to the TEI0004 socket and those from the vias and other routing changes that I made are enough to tip it over the edge.

Now I should solder up another bridge board with JTAG relay headers and confirm that it works with the bridge PCB, rather than the cable.  I'm not expecting any problems, as the bridge PCB should result in better not worse signal integrity. But for completeness I should check this. And it works!


Now I want to confirm that the reflections are the problem here, by soldering the TE0790 and relay connectors onto an unpopulated revD expansion board, but only after cutting the traces to the TEI0004 header.  If that makes it work, then we know at least one of the factors involved -- although the worsened routing must be having an impact as well, since the TEI0004 header was fine to be connected on the revC board. The challenge is going to be accurately cutting those traces. We'll see how I go, but maybe after lunch.

I had a quick go, but I don't really have the tools on hand to do it confidently.

Anyway, as much as I might be trying to avoid it, I think the solution is that I need to do another revision of not just the bridge board, but also the expansion board itself.  The PMOD connector board will be okay as it is, because I can move the PMOD connectors on the expansion board to correct for the placement offset.

What I think I need to do is to move U6 and U7 towards where the TE0790 relay header is currently located, and then move the TE0790 relay header to be as close as possible to the TE0790 connector itself.  It's also an open question as to whether I leave the TEI0004 connector on the expansion board, as I think most people have a TE0790, anyway.

Reworking the expansion board will let me do a couple of other helpful things: First, if I test the A/V output lines with just 2 resistors, and doing the 8x over-drive trick that I do on the least significant bit, and confirm that this works, then that will let me free up 6 IO pins.  I can then use some of those to directly connect the C1565 serial link without going through the ring buffers.  And the left over ones I can route to another header to the bridge board, and perhaps stick a WiFi enabled ESP32 on the bridge board, for the folks who would really like to have a built-in WiFi interface for the MEGA65.

The first step is to make a bitstream that just drives the 500 Ohm and 8x 500 Ohm = 4K Ohm resistors in the A/V outputs to see if we still get decent video quality.  I'll just ignore the other 2 resistors for now. Let's start with our test-pattern on digital video output as reference:

So we have a set of nice colour gradients and colour bars. Now with our existing 4-pin DAC implementation:

The camera doesn't really capture it particularly accurately. But it's pretty clear (and totally expected) that the colour reproduction for composite video is much worse than a pure digital video output. For real-life use, it actually looks not too bad, at least as good as a C64 on a real TV did back in the day, and probably on a par with component video output to a proper CRT monitor -- and this is with the MEGA65 just producing a single composite output. With separation into chroma and luma channels, it should be better again. I just haven't implemented it yet.

But dropping to 2 pins only results in a much worse outcome, with colours distorted and quite washed out:

Either we need to increase the frequency of the over-sampling, or we need more pins, or both.

Thanks to some help from the CVBS community, I looked at this nice calculator to design a low-pass filter for the A/V output: https://markimicrowave.com/technical-resources/tools/lc-filter-design-tool/, but I'm not totally confident working with filters. For example, if we are using a ~500 Ohm based resistor ladder, does that mean that our input impedance is ~500 Ohm? I think so, but I'm not totally sure.

So I think the step before that is to see how 3 bits of output performs, without filter or over-sampling the bottom bit.  If it turns out to be enough for good colour PAL reproduction, then that's a convenient solution to freeing up a few pins.

Ah, except in the investigation I found that I am probably not looking at the composite channel at all, but rather at the luma channel, and it's just leakage between traces causing a very low-saturation colour effect on the signal.  That is, I'm probably connected to the wrong pin on my video output cable. Also, the cable is connected to the revC expansion board, which I recall now has an issue with one of the channels.  

To work around this, I'm going to do something that I had intended to do anyway, which is to make each output channel configurable. This will also allow the board to output composite, S-Video, component video and other formats in future.  That's now synthesising, and will allow selection of each output on the A/V connector independently, initially just from chroma, luma, composite and left and right audio. This will be via $FFD8000-2, with the lower 4 bits of each register selecting the source.  I'll document it further in future.
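
As a rough illustration of the register scheme just described, the sketch below models three select registers at $FFD8000-2 whose low 4 bits choose the source. The numeric source codes, the one-register-per-channel mapping, and the choice to leave the upper 4 bits untouched are all assumptions, since none of them are specified here.

```python
SELECT_BASE = 0xFFD8000   # first of the three select registers ($FFD8000-2)
NUM_CHANNELS = 3

# Hypothetical source codes: the sources are named in the text, but the
# values assigned to them here are invented for illustration.
SOURCES = {
    "chroma": 0,
    "luma": 1,
    "composite": 2,
    "audio_left": 3,
    "audio_right": 4,
}


def channel_register(channel):
    """Address of the select register for one A/V channel (assumed layout)."""
    if not 0 <= channel < NUM_CHANNELS:
        raise ValueError("channel must be 0, 1 or 2")
    return SELECT_BASE + channel


def select_source(reg_value, source):
    """Return reg_value with its low 4 bits replaced by the source code."""
    return (reg_value & 0xF0) | (SOURCES[source] & 0x0F)


print(hex(channel_register(2)), hex(select_source(0xA7, "composite")))
```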

In the process I also noticed some bugs with NTSC colour generation, and that I was not actually producing a pure luma signal, except when the MONO bit of the VIC-III was enabled.  I also messed up the source select logic, so that any IO write would update the source, rather than just one targeted at the correct address.  So I'm reconfiguring with all those fixes. It will be nice to see if it fixes the weird NTSC colour problem (beyond NTSC's normal colour weirdness ;)...

Well, it didn't fix the NTSC problem, but it does look like I can select the various outputs on the various channels of the A/V plug now, which is good. So now to try again to reduce to 2 pins with 3-bits per pin.  And the result is much better. The colour is still a bit washed out compared to using 4 bits directly, at least I think so. But it is way better than before when I had stuffed things up in various ways. To get an idea of the difference:

The images in the left-column use only 2 pins each, with the 8x over-sampling used to get 3 bits from each pin, for a total of 6 bits of resolution. The right-column uses 4 pins, with three pins directly connected to resistors, and the 4th pin using 8x over-sampling to get 3 bits instead of 1. The result is 1+1+1+3 = 6 bits of resolution. In other words, the two should be effectively equivalent in terms of output.  
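
To make the 2-pin arithmetic concrete, here is a minimal sketch of how a 6-bit value could be split across a coarse pin (500 Ohm rung) and a fine pin (4K Ohm rung), each driven high for 0 to 7 of every 8 sub-samples. The bunched pulse pattern, the function names, and the unloaded averaging model are assumptions; the actual VHDL may distribute the pulses differently.

```python
OVERSAMPLE = 8          # sub-samples per output sample
G_COARSE = 1 / 500.0    # conductance of the 500 Ohm rung
G_FINE = 1 / 4000.0     # conductance of the 4K Ohm rung (8x weaker)


def pin_pattern(level):
    """Drive a pin high for `level` of the 8 sub-samples (level 0..7)."""
    return [1] * level + [0] * (OVERSAMPLE - level)


def two_pin_patterns(value):
    """Split a 6-bit value into 3-bit patterns for the coarse and fine pins."""
    if not 0 <= value <= 63:
        raise ValueError("value must fit in 6 bits")
    return pin_pattern(value >> 3), pin_pattern(value & 7)


def averaged_level(value):
    """Time-averaged output relative to both pins held high, ignoring load."""
    coarse, fine = two_pin_patterns(value)
    duty_coarse = sum(coarse) / OVERSAMPLE
    duty_fine = sum(fine) / OVERSAMPLE
    return (G_COARSE * duty_coarse + G_FINE * duty_fine) / (G_COARSE + G_FINE)


LEVELS = [averaged_level(v) for v in range(64)]
print(len(set(LEVELS)), "distinct levels")
```

Because the fine rung carries one eighth of the weight of the coarse rung, the 64 input values map to 64 evenly ordered averaged levels, which is why 2 pins can in principle match the 1+1+1+3 arrangement.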

Now, my camera doesn't really do a perfect job of capturing the colour, or all of the video artefacts for that matter.  In particular, the colour gradients are much better in real-life, and the colour intensity is also a bit better. At least with the DELL monitor I am testing with. With real CRT monitors, the results could be totally different.

Anyway, the overall difference is a bit reduced colour saturation and brightness when using only 2 pins.  I have every reason to believe that those can be corrected in VHDL, at least partially.  Thus I feel safe in moving to only 2 pins per colour channel, and thus recovering 3 channels x 2 pins = 6 pins total.

I had a problem where the monitor was not getting VSYNC or correct HSYNC, and I wasted quite a few hours chasing my tail on this. In the end turning the monitor off and on again fixed it. Of course, this means that something must still be a bit marginal.  To help debug that, I have added sawtooth and sine wave test signal sources to the A/V output, so that I can verify that the resulting waveforms look fine.  

The complication for me to evaluate this, is that my oscilloscope has a bandwidth of 100MHz, which means that it is not integrating the over-sampled signal as much as I would like, and as much as I expect typical CRT monitors will do.  That said, the sawtooth waveform already looks pretty good. The sine curve had some stupid errors in the generation, so I'm resynthesising that right now.  As mentioned, the higher bandwidth of the oscilloscope means that we will see oscillation around the mean point of the signal.  

That said, as the images below show, it's still really not that bad:

First up, we have the saw-tooth waveform:

There is more noise at low values, which is not so great, presumably at the level where only 1 pin is really being active.  Also, it's not totally linear, but it's still not that bad.

Then for the sine-curve we get a really nice clear sine waveform visible, again with noise, especially at lower absolute values:

If we zoom in on the time domain, we can see that the noise is substantially caused by rapid oscillations, i.e., the over-sampling method we use to generate the signal:

The oscilloscope is showing an approximate frequency of this oscillation of ~65MHz, which is a bit of a mystery, as our pixel clock rate is 27MHz, so I'd expect it to be an integer multiple of that, but things are never that simple. Anyway, what it clearly shows is that if we do add a filter, we would like at least 10dB of attenuation by 2x27MHz = 54MHz to really smooth this out. Anyway, it's enough for our purposes right now.

Otherwise, the main issue with the video is we have a bit of a red tint on the composite video, so we will need to adjust the coefficients that control that at some point. That doesn't require changing the hardware, so that's good enough for now.

A more annoying issue is that the expansion board is causing spurious resets, presumably because it thinks the user port /RESET line is being pulled low.  This is despite the /RESET pin on the user port seeming to be sitting comfortably at +5V the whole time. This makes me think that maybe the ring buffer for reading the pins is a bit wonky.

My best guess here is that the latching of the read ring is not rock-solid. What I should be doing is reading the previous bit on the instant just before I clock the next bit. To debug this I have added a few things: (1) I have made the controller check that the ID lines are indeed one high and the other low, which should prevent framing errors from sneaking through; (2) In the process, I have made it easier for future revisions of the expansion board to offer different ports; (3) Added the option to mask the user port reset line, in case it is still causing problems; (4) Count the number of reset events on the user port, in case that is the problem; (5) Allow direct reading of the input ring from the expansion board; (6) Generally document the registers that control all this stuff.
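
The idea behind check (1) can be sketched as follows: a frame read from the ring is only trusted if the two ID lines disagree, so a mis-framed read of all zeros or all ones cannot be mistaken for a reset request. The frame layout (a dict with `id_a`, `id_b` and an active-low `reset_n`) is hypothetical and only stands in for whatever the real ring contains.

```python
def frame_is_valid(frame):
    """Accept a frame only if the two ID lines are one high and one low."""
    return bool(frame["id_a"]) != bool(frame["id_b"])


def reset_requested(frame):
    """Report a reset only from a valid frame whose /RESET bit reads low."""
    return frame_is_valid(frame) and not frame["reset_n"]


# A mis-framed read of all zeros looks like /RESET low, but fails the ID check.
glitch = {"id_a": 0, "id_b": 0, "reset_n": 0}
genuine = {"id_a": 1, "id_b": 0, "reset_n": 0}
idle = {"id_a": 1, "id_b": 0, "reset_n": 1}

print(reset_requested(glitch), reset_requested(genuine), reset_requested(idle))
```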

With (1) in place, the spurious resetting has totally stopped, so I'm guessing that was the problem.  I'll leave it at that for now.

Meanwhile I now have the dimensions sheet for the MEGA65 main board, so I should be able to refine the positions of the connectors to the MEGA65 main board. That also reminds me that on the R4/R5/R6 boards, we now have an analog audio output header that is conveniently located enough that I should be able to make it an option to pipe to the audio pin of the analog video port, which is another advantage.

So all in all, I think we can add the WiFi adaptor, as well as improve the plumbing to the C1565 port. This will be in addition to the correction of the connector locations, and re-routing the TE0790 relay connector so that JTAG can work reliably.  That sounds to me like more than enough changes for one revision of the board.

So let's look at those connector positions, and work out what needs adjusting, and by how much, now that I have the ground-truth relative positions of the MEGA65 main board. But before that, I really should double-check that everything that did work on the revC still works on the revD, and that the positional corrections I made for that are correct.

One thing I noticed is that the floppy connector on the expansion board lines up with one of the screw points for the floppy drive:

This isn't a show-stopper, but it does require more cable origami to get the cable to be able to get into the connector:

I can move the header on the expansion board towards the front of the MEGA65 by 18mm to clear it. This will slightly increase the cost of the PCB, but it shouldn't be too tragic. The main question really, is whether moving the cable that far forward will collide with the back of the keyboard.

I can test this by purposely mis-seating the floppy cable, and seeing if I can still close the case. Each position is a movement of 0.1" = 2.54mm. So 18mm = 7.1 positions, so I'll try 7 positions.

It's close, but doesn't quite fit. Now, part of the reason for that is that the edge of the keyboard PCB is still just over the connector. If I move the floppy connector to the right 7mm, it should clear it comfortably. This will require extending the right edge of that part of the PCB, but there is comfortably room to do this, so that's fine.

I also double-checked that there is still space for the TE0790 behind the floppy cable, as I did move that down on the last revision. It does still fit, but with not too much space to spare!

The tape port still works, and I'll assume that the user port still works, too, since they are both on the same ring buffer loop. I do get errors loading from the test tape sometimes, but at least 1 out of 3 times I am able to correctly load a game from tape that uses a tape fast loader. So I'm going to assume for now that it is likely the age of the tape and datasette, rather than the expansion board that is to blame here.  I might later experiment with adjusting the ring buffer clock speed to see if it makes it better or worse.

I'm now going through the process of updating the schematic and PCB for the various changes. 

I'll start with simplifying the resistor ladders for the video output channels: Instead of using 4 pins for each, I'll use only 2, with 500 Ohm and 4K Ohm, constructed using only 1K resistors in series and parallel, which will simplify the bill of materials. This means each channel now looks like this:


That's a net increase of 2 resistors per channel, which I'll need to fit on the PCB, but that should be ok.  
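
For reference, the two rung values can be built from 1K parts as sketched below. The specific arrangement (two in parallel for 500 Ohm, four in series for 4K Ohm) is an assumption, chosen because it is the simplest combination that gives 6 resistors per channel.

```python
def series(*resistors):
    """Total resistance of resistors connected in series."""
    return sum(resistors)


def parallel(*resistors):
    """Total resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistors)


R = 1000.0  # the only resistor value on the bill of materials

COARSE_RUNG = parallel(R, R)        # about 500 Ohm
FINE_RUNG = series(R, R, R, R)      # 4K Ohm
RESISTORS_PER_CHANNEL = 2 + 4

print(COARSE_RUNG, FINE_RUNG, FINE_RUNG / COARSE_RUNG, RESISTORS_PER_CHANNEL)
```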

I'd also like to add provision for optional low-pass filters on the three channels.  Based on the oscilloscope probing, we need to have at least 10dB roll-off by ~54MHz.  I'm using the great filter design tool at https://markimicrowave.com/technical-resources/tools/lc-filter-design-tool/.  The output impedance of analog video is normally 75 Ohms, and the input impedance is approximately the resistance of the lowest resistance rung of our resistor ladder, so 500 Ohm in this case.  Our cut-off frequency can be set to 13.5MHz, the maximum channel bandwidth for a PAL or NTSC signal -- in reality it probably only needs to be half that, but it's certainly enough.  So let's put the appropriate parameters into the model:

and see what the response of the filter looks like:

Well, that looks pretty okay to me. The insertion loss below 13.5MHz is about 2dB, which means that about 2/3 of the input voltage will be able to pass through. Then we have about 3dB filter loss below 13.5MHz, so our total loss for the frequency range we want is maybe 5dB. That's a bit more than I'd like, as it means that our output voltage will be divided by about 3.1. Given our output impedance of 500 Ohm and the input impedance of a monitor at 75 Ohms, this means that before the filter, our output voltage should be something like 3.3V x 75 / (500 + 75) = 3.3V x 0.13 = 0.43V, which is already at the lower end of things, since the target peak voltage is around 0.7V.  The filter will divide this by a further factor of 3.1, to give a peak voltage of only ~0.14V, which is really quite low.

We can improve this by reducing the output impedance from 500 Ohms to 250 Ohms, which will bring the peak output voltage up to 0.76V, which is safe without further reduction by the filter, and still allow the voltage after filtering to nearly double to ~0.25V ... except that filters are never that simple. We need to update the filter model to reflect the 250 Ohm input impedance, which gives this response curve:

In other words, even though we have halved the output impedance (for a gain of ~3dB), the insertion loss of the filter has increased by almost exactly the same amount! That is, our filtered signal will not have any greater voltage range than with an output impedance of 500 Ohms. Grr! Well, in any case, 3.3V across 500 Ohms = 6.6mA, and 250 Ohms would mean 13.2mA, which is still within the max current limits of the FPGA, so let's halve the output impedance, but add a bypass-able resistor that increases the impedance back to 500 Ohms by adding a 250 Ohm resistance in the path, as well as making the output filter selectable.  
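
The arithmetic in the last few paragraphs can be reproduced with a short sketch. It assumes an ideal 3.3V source, the lowest rung acting as the whole output impedance, and a purely resistive 75 Ohm load; the function names are made up for illustration. It also prints both dB conventions: the factor of about 3.1 used for 5dB corresponds to the power convention (dB/10), whereas the voltage convention (dB/20) would give about 1.8, so the filtered-voltage estimates above may be on the pessimistic side.

```python
V_IO = 3.3      # FPGA I/O voltage
R_LOAD = 75.0   # nominal monitor input impedance in Ohms


def peak_voltage(r_out):
    """Voltage across the load for a given output impedance (plain divider)."""
    return V_IO * R_LOAD / (r_out + R_LOAD)


def max_current_ma(r_out):
    """Worst-case pin current in mA, with the rung shorted straight to ground."""
    return V_IO / r_out * 1000.0


def db_to_power_ratio(db):
    """Ratio of powers for a given number of dB."""
    return 10 ** (db / 10)


def db_to_voltage_ratio(db):
    """Ratio of voltages for a given number of dB."""
    return 10 ** (db / 20)


for r_out in (500.0, 250.0):
    print(f"{r_out:.0f} Ohm: {peak_voltage(r_out):.2f} V peak, "
          f"{max_current_ma(r_out):.1f} mA worst case")
print(f"5 dB as a power ratio:   {db_to_power_ratio(5):.2f}")
print(f"5 dB as a voltage ratio: {db_to_voltage_ratio(5):.2f}")
```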

The main trade-off here will be adding yet more components on the board, as it would require either one 250 Ohm resistor (which I don't have in stock), or more conveniently, four 1K Ohm resistors in parallel, but that would take the most space.  The filter also requires 3 components, and then we would have two jumpers to allow the bypassing of the extra output impedance and the filter. So in the worst-case scenario, this would mean we would have the existing 6 resistors for the resistor ladder, plus 3 filter components, 2 jumpers to select function, and between 1 and 4 resistors for the output impedance trim resistors. Multiplied by 3 channels. That's probably more than I can realistically cram in, even given the extra space available from increasing the board size to accommodate relocated connectors.

I thought I would just check the voltage I get on the video channels with a 75 Ohm load, just to be sure. I didn't have a 75 Ohm resistor handy, but I did have a 68 Ohm resistor, which should be fine, just resulting in slightly higher voltages than with a real monitor. I was expecting to see ~ 3.3V x 75 / (500 + 75) = 0.43V. But instead I am seeing a peak voltage of more like 4V instead of 0.4V... which is also the voltage I see when it's open circuit. Ah, I was measuring it incorrectly. It is indeed somewhere around 0.4V, so halving the output impedance to 250 Ohms makes sense, and I won't need the impedance selection jumper and resistor -- just the filter and filter bypass jumper.

So let's start moving things around and see how much space I can free up for the filters etc.  I've also flipped the connector over that carries the JTAG signals to the bridge board, to avoid the need for vias, that would degrade the signal integrity.  The difference in routing is now quite profound with those changes, with the JTAG lines now running simple straight paths, without any vias at all, and no non-JTAG signals crossing the path of the JTAG signals.  I really think that they should have much better signal integrity now. You can see the improved routing between the two 12-pin TE0790 connectors here:

As a reminder of what it looked like before:

There were those two ICs in the way, and crazy circuitous routing to get around them and the existing circuitry. I also moved J8 further up the board, to minimise the path lengths, so I'll have to update the bridge board position of J8, as well as horizontally flipping it (which should also remove some vias on that board, too).

I've also moved the PMOD connectors on the expansion board left by 0.5 mm, so that the existing revD PMOD connector PCB spacing will be exactly correct. Well, I hope exactly correct.  I also adjusted the position of the other MEGA65-board side connectors to try to get it all right.  This was a bit fiddly to work from the various precise and non-precise sources of information I had, and to then put the positional markings on the bridge board PCB layout as measurements, to make sure that I have everything as correct as possible:

You can also see that I have added a place for the ESP32 WiFi module in the middle of the PCB:

I have included a complete cut-out zone for the external antennae, located below the straight and direct traces for the TE0790 relay to the expansion board. Note that I have also added an extra TE0790-compatible connector to this bridge board, to allow easy connection and debug of the ESP32 module. That is, the header labelled "ESP32 JTAG+UART (TE0790)" is for that purpose alone.  I've also added a header for the ESP32 sensor pins, in case that turns out handy for something.

I still need to add a few more break-out pins for the ESP32, so that all the bootstrap pins are exposed conveniently, in case I need to adjust them. I've done that now, as well as adding a dedicated ESP32 UART break-out connector for convenience.

I'm thinking that that is probably about it for the bridge board.  Now I need to add the plumbing to connect this to the expansion board beneath it. That's done as well now, using 3 of the 6 lines to connect UART RX & TX, as well as the WIFI_EN line, which needs to be 3.3V.  That leaves only 3 pins for the C1565 interface improvements, which hopefully will be enough.

So let's look at the C1565 interface improvements.  Ideally we want the ring clock and latch as well as the SERIO data pin to be fast by being directly connected to the FPGA pins.  Three signals, three pins, sounds easy. The trouble is that those are all 5V on the C1565 plug, but 3.3V on the FPGA pins, and the SERIO pin is bi-directional.  But let's start with the easy ones: We can just connect the C1565 clock to the expansion board's existing ring clock.  So that just leaves the latch and data pins. 

The latch is fairly straightforward: It's an output only from the FPGA to the C1565 port, so we just need a 3.3V to 5V level converter. If we were lucky, we'd have a spare buffer on one of the 74LS125's, but of course they are totally full. So we will need at least one more IC.  But before we just go and add another 74LS125, let's see what we need to make the SERIO line bidirectional. We can do the output direction using part of a 74LS125. But for the input direction we need to level convert from 5V to 3.3V. SERIO is output when the latch line is high, and input when low, like this:

2.5.5. F011 Disk Expansion Port Serial Protocol

     +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +--
     | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CLK
   --+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+

    +---------------------------------+                     +--
    |                                 |                     |  LD
   -+                                 +---------------------+

   --+---+---+---+---+---+---+---+---+---+---+---+---+---+---+-
     |LED|MOT|STP|DIR|SID|DS2|DS1|DS0|SPR|DKI|DKC|IND|PRT|TK0| SERIO
   --+---+---+---+---+---+---+---+---+---+---+---+---+---+---+-

So we can in fact control the direction of SERIO using LD: When LD is high, we want the output enable on the 74LS125 buffer for SERIO to be active.  I'll connect that now, and then think about what to use for the input side.  So the outputs are connected, and that uses 2 of the 3 pins, and I still have one pin free for the input.  So I don't even need to do anything fancy to switch the direction if I don't want to. I can just use a simple "always on" input buffer.

Meanwhile, I am hitting a problem with KiCad 7 complaining about the user port and tape port edge connectors.  Older versions of KiCad didn't have this problem, and it is apparently a known false alarm, but still quite annoying. It seems the two options are to ignore the errors in the design rule checker (which I never like doing), or to add a bit of solder mask back between each pin on these edge connectors, which is also not that great. I guess I'll have to add exceptions to the design rule checker...

Okay, so now that I have the design rules passing again, I can go back to planning how I am going to read the SERIO line. I think I'm just going to add yet another 74LS125, powered from the 3.3V rail and with a 1K Ohm resistor on the input from SERIO, to limit the maximum current to 5V / 1K Ohm = 5mA, which should be fine.

Okay, so I think I have SERIO input now correctly plumbed in. The schematic page is getting a bit crowded, though, so it's not as immediately obvious if I have it right or wrong, which I don't like.  So I'll just have to go over it carefully for now.

I'm going to pull out the 1565 port stuff onto a separate sheet in the schematics so that I can reason about it much more easily. This is the result:

You can also see that the WIFI_EN signal for the ESP32 wifi adapter is also generated by this block. This is because it needs to be 3.3V, and I didn't want to waste a whole FPGA pin on a line that might toggle once every time you start a wifi-enabled program.  

I have used the FPGA pin that freed up to separate the C1565 port clock from the ring buffer clock, so that they can be clocked at different rates. This is to allow the C1565 bus to be slower, which will likely be required if we want to allow reasonably long cables to external drives.

The only problem I hit in the above is that I had a +5V symbol lurking spare after I rearranged everything. I can't see where it is missing from, if I did accidentally drag it when copying something to the new sheet. I did check PCB to schematic parity, and nothing showed up. So I still have zero idea where it came from. Maybe I had +5V attached to a single net twice.  Anyway, if there is going to be a problem with the next revision of the board, this is what I am going to guess will be a likely cause. Fortunately if that happens, it will be easy to bodge a blue wire between whatever pin needs it and a pin with +5V.

So, I'm getting towards the end of the list of things I want to address. The very last thing I am thinking of doing for the C1565 port is to address what I see as a missed opportunity by Commodore at the time: The C1565 port has no +5V or +12V, which means that the external drive needs a separate power pack, unlike on the Amiga.  Modern 3.5" drives don't even need 12V, so we only need to include +5V.  But there are no spare pins in the mini-DIN8 connector. 

Fortunately there are 9-pin mini-DIN connectors, so I can just use one of those. This will even still allow use of a genuine C1565 drive if someone has one. All that they would need would be an adaptor cable. But as there are about 2 such drives in existence, I'm not really worried about even that inconvenience.  Switching the connector over wasn't particularly hard -- the biggest effort was finding a variant of the connector for which I could easily find a Kicad footprint.  This probably means I'm not using the cheapest version of this connector, but I can live with that. I might buy a couple of the cheapest ones, and confirm that they still fit in this footprint, as they should be identical. The datasheets look like this is the case.

What would be great for finally testing the C1565 is for me to design up my own C1565 compatible PCB to go with a standard PC 3.5" floppy drive. I might do that soon, but first I want to finish off this expansion board.

This means we are now up to adding that low-pass filter to the A/V output lines.  As discussed above, a 3rd order low-pass filter should be adequate, like this one:

The 500 Ohms represents our source, and the 75 Ohms the monitor it is connected to, so we don't need either of those. Just the two capacitors and the inductor between them. As a safety catch, I'm going to make the filter able to be bypassed, in case I stuff it up somehow. This will also let me see in real time how the filter performs, if I can easily switch it in and out by moving a jumper. This is how I am planning to hook each one up:

I say planning, because the board is getting a bit busy, and I want the filter switch jumpers to be accessible when the bridge board is fitted, and I don't want any signals crossing those JTAG lines that caused me trouble before. And, I'd ideally like all the filter switches in the same place. Oh, and I want to route lines for the audio pin on the A/V jack, too, so that the audio header on the MEGA65 R4+ boards can be fed directly to the A/V jack if desired.

Well, I succeeded in the end, but it wasn't easy, and I'm not totally comfortable that there won't be a lot of cross-talk on the A/V output lines that might compromise signal quality. But we will have to see.  But if the result is not too bad, then I'll be happy enough, as the idea of this board is that you can build it yourself, not get the most perfect picture and sound possible -- the digital video output and 3.5mm audio jacks are there for that.

Anyway, this is how the board looks now, with, I think, everything it needs:

In the end I couldn't get all the filter jumpers in the one place, so one of them is up to the left of the tape port, where it should be exposed even with the bridge board over the top.  The other two are to the left of the two mounting holes that are below the right-hand side of the user-port. Those are together, but are likely to be covered by the bridge board when fitted.  That doesn't greatly worry me, as after testing is done, those jumpers will either be permanently fitted (or not), and quite likely replaced with simple tracks instead.

Other things to observe on this revision of the board compared to the last one, include the movement of the floppy cable header down a bit as I discussed some time back to make the cable origami routing easier.  Also, just above the MEGA65 logo is the new 8-pin header to connect to the peripherals on the bridge board, primarily the ESP32 wifi adapter and to route the audio signals to the A/V jack.

Speaking of which, I have routed both left and right audio to the A/V jack, even though the C64 only supported mono on it. This is done by using the almost never used "audio in" pin on the C64 audio jack as a second audio output pin.

So the bridge board now looks like this:

I have contemplated making this board 4 layer instead of 2 layer, but I'm not sure that it is worth the extra cost for whatever benefit it might give for signal integrity, given that most of the signals are quite slow, or are simply being routed between pairs of connectors (like the TE0790 JTAG relay).  So I'll leave it 2 layer for now.

So that just leaves the C1565 drive board before I can place my order for the revised PCBs, assuming I want to do that.  It should be a simple board to design, and with no tricky routing, so I'll give it a quick go, and if it takes too long, I'll just stop and place the order for the other boards.

Someone has already made a replica C1565 PCB that works with real C65s (except for formatting, because I believe the index hole sensor is required to know when to enable the write gate). However, they haven't released the schematic.  Also, Bo Zimmerman has the schematic for the original 1565 PCB. The main tidbit from that is to know that it uses a 74LS125 to tri-state the F_READ line from the floppy drive when the drive is not selected.

I'll start by adding the 34-pin floppy data connector and 4-pin floppy power connector, and then the 9-pin mini-DIN. Actually, we can have 2 of the mini-DINs and allow daisy-chaining, like with the Commodore serial drives. The C65 ROMs don't support more than 2, but it doesn't mean that we can't. The main limitation will be the current draw from the cumulative circuits on the relatively small pins of the mini DIN connector.  But we may as well see what we can achieve.

Next we need to put a tri-state gate on the RDATA line via a 74LS125.  I'll use a 74HC85 4-bit comparator to check whether the C1565 unit is the currently selected one.  Those are active high, while the 74LS125 is active low, so I'll use a buffer from the 74LS125 to make an inverted driver for that. Or I could switch to an active high buffer like the 74LS126 -- but let's see what the other bits of logic we need require, before making decisions like that.

Next up we will need an 8-bit latch. Going back to our 1565 protocol diagram:

2.5.5. F011 Disk Expansion Port Serial Protocol

     +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +--
     | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CLK
   --+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+

    +---------------------------------+                     +--
    |                                 |                     |  LD
   -+                                 +---------------------+

   --+---+---+---+---+---+---+---+---+---+---+---+---+---+---+-
     |LED|MOT|STP|DIR|SID|DS2|DS1|DS0|SPR|DKI|DKC|IND|PRT|TK0| SERIO
   --+---+---+---+---+---+---+---+---+---+---+---+---+---+---+-

We can see that we need to latch the data word on the falling edge of LD. But the 74LS595 latches on the rising edge. So we need to either invert LD, or find a latch chip that has a negative edge latch. We can probably invert LD using one of the spare buffers in the 74LS125, if we need to.  We also need to gate the SERIO line to only be driven by the drive when LD is low.  The 74LS165 to serialise the drive status bits also needs the opposite sense to work. It really is a pain, in fact, that LD is not the opposite to what they chose, as it would have been quite a bit simpler for the logic decoding. 
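To make the latching and direction switching concrete, here is a small Python model of one frame as seen from the drive side. It is only a sketch: the field names come from the diagram, but the assumption that the bits travel in the left-to-right order drawn, one per CLK cycle, and the names `run_frame`, `HOST_TO_DRIVE` and `DRIVE_TO_HOST` are mine, not taken from any real C1565 design.

```python
# Field order as drawn in the protocol diagram (assumed to be transmission order).
HOST_TO_DRIVE = ["LED", "MOT", "STP", "DIR", "SID", "DS2", "DS1", "DS0"]
DRIVE_TO_HOST = ["SPR", "DKI", "DKC", "IND", "PRT", "TK0"]


def run_frame(command, status):
    """Simulate one frame on SERIO.

    command: dict with the 8 host-to-drive bits (sent while LD is high).
    status:  dict with the 6 drive-to-host bits (returned while LD is low).
    Returns (command latched by the drive, status seen by the host).
    """
    # One (LD level, field name) pair per CLK cycle.
    slots = [(1, n) for n in HOST_TO_DRIVE] + [(0, n) for n in DRIVE_TO_HOST]

    shift_reg = []   # bits shifted into the drive while LD is high
    latched = None   # command word captured on the falling edge of LD
    received = []    # bits the host reads back while LD is low
    prev_ld = 1

    for ld, name in slots:
        if prev_ld == 1 and ld == 0:
            # Falling edge of LD: the drive latches the command word.
            latched = dict(zip(HOST_TO_DRIVE, shift_reg))
        if ld == 1:
            shift_reg.append(command[name])   # host is driving SERIO
        else:
            received.append(status[name])     # drive is driving SERIO
        prev_ld = ld

    return latched, dict(zip(DRIVE_TO_HOST, received))
```

The point of the model is just that the drive must not drive SERIO until LD has fallen, and that the command word is only complete at that same edge.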

The existing replica only needs 4 chips, but that's because they are only doing a single bit of drive ID decoding. I'd still like to keep the option for more than 1 external drive. I could solve this by adding a 5th chip, e.g., a hex inverter, to allow me to invert this signal, but a whole chip to do that is a bit of a pain, when I am only exactly one gate short of what I need.

One option might be to use a 74LS139 dual 2-to-4 decoder, as those have two such decoders, so I could still allow 4 drive IDs (probably the odd ones, 1, 3, 5 and 7), and then use the second one as a makeshift inverter. But maybe I'm just trying to be too darn clever, and should just get over myself and use the inverter.
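The decoder-as-inverter trick is easy to check with a truth-table model. The sketch below models one half of a 74LS139 (active-low enable, active-low outputs). The wiring it assumes (enable tied low, B tied low, signal on A, result taken from Y1) is just one way this could be done, not necessarily how it would end up on the board.

```python
def decoder_139(enable_n, b, a):
    """One half of a 74LS139: returns (Y0, Y1, Y2, Y3), all active low."""
    selected = (b << 1) | a
    return tuple(0 if (enable_n == 0 and selected == n) else 1 for n in range(4))


def invert_with_139(signal):
    """Use a spare decoder half as an inverter.

    Assumed wiring: /E tied low, B tied low, signal on A, output from Y1.
    Y1 goes low only when A is 1, so Y1 is the inverse of the signal.
    """
    return decoder_139(0, 0, signal)[1]
```

With the same wiring, Y0 is a non-inverted copy of the signal, so both senses of LD would be available from the one spare decoder.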

I think I have a sensible schematic now, and the beginnings of a PCB layout. But then I realised that the reason I was looking at doing this now is because I wanted to be able to make sure the 1565 port would work.  That was an issue when I had some of the control lines on the ring buffers, which would have caused them to be much slower. But now I have it nicely directly connected to the FPGA, so that isn't an issue. 

In short, I can leave that for another time. Which is good, because I've just about run out of weekend. So I'm just going to give the expansion and bridge boards a last going over, and then generate the gerbers to send off for fabrication.

The PCBs have been ordered, so all that's left now is for me to order the last few parts that I don't have lying around, so that they can ship while the PCBs are being manufactured. Normally a 4-layer board takes about 5 days for PCBway to fabricate, and another 4 days or so to ship. So probably I'll have the boards in a couple of weekends' time -- which is also about how long it takes for an order to ship from Digi-Key in the USA, because they use FedEx who are a bit slower to get to Australia than DHL from China.

Well, that order is in now, too. So now it's hurry up and wait for a couple of weeks for everything to arrive, and see what silly mistakes I have made on these boards ;) But seriously, I'm hopeful that we are now on the home-run with this, and will have a nice functional and fun expansion board solution for the MEGA65 in the not too distant future... maybe even around the same time that the back-log of MEGA65s ships around the middle of the year...


Sunday 18 February 2024

More cartridge port fixes for the MEGA65

Some cartridges are still not working correctly on the MEGA65.  These issues seem to not be restricted to the new R5/R6 boards.  For example, Tiny Quest doesn't work reliably.  Thanks to Olivier, I now have this cartridge with me here at home, and have been investigating it for a couple of weeks on and off now.

The key issue that I have discovered is that when I implemented the MEGA65's cartridge port controller, I was not allowing a "hold time" after the falling edge of the 1MHz clock on the cartridge port.  This means that cartridges that have multiple logic chips can have signal propagation times that mean that the value written to a register in a cartridge can be wrong.  This is a problem with the Tiny Quest cartridge.

Let's go through this from the ground up:

First, the 6502 datasheet gives us this timing diagram:

 

The important bits are DATA(READ) and DATA(WRITE) in particular.  We can see that a T_HR and T_HW delay is required after the Phi2 clock goes low.  This is to allow the logic in the CPU to accept and process the data.  The same applies to logic chips in a cartridge. We weren't implementing this hold time. These hold times should be between 60 and 150ns according to the datasheet.

Now, the MEGA65 doesn't have a 6502 connected directly to the cartridge port bus, but cartridges are made with this kind of timing assumption in mind. Specifically, they are allowed to assume that the data lines will be valid for at least 60 ns after the relevant edge of the 1MHz clock on the cartridge port.

So let's take a look at the schematic of the Tiny Quest cartridge, which uses a HUCKY v1.03 64KB cartridge. The original description is in German, but it is also discussed in English here, including a reconstructed schematic:


The bit I am interested in primarily is the logic that listens for writes to $DE00, and selects which 8KB bank of the EPROM to use, and whether to disable the cartridge completely: Bits 0-2 are inverted bank select bits into the EPROM, and writing a 1 to bit 3 will cause the cartridge to completely disable itself until the /RESET line next goes low.  Here is the relevant part of the schematic enlarged:

If we focus even more on the logic for the cartridge control, it will be easier to read and to follow:

We can see that the 4 lowest data lines are connected to the D inputs of the 74LS173 chip (which is really just a latching buffer).  The outputs feed into the three 74LS04 inverters that go to the EPROM upper address bits (which we will ignore for now), and also directly to the /EXROM line -- this is the important and clever bit.

If the /EXROM line is low, the MEGA65 (or C64) knows that the cartridge is saying that it has an 8KB ROM at $8000-$9FFF.  The 74LS173 has all of its outputs low when the CLR line (pin 15) is high. This is why the /RESET line is fed through an inverter into pin 15 (called Mr in the schematic above), so that when /RESET is low causing the computer to reset, it is also resetting the cartridge to be enabled, and selecting bank 7 (because the bank select bits are all zero, but are fed through the inverters to make them all 1). So this ensures the cartridge is in a known specific state on power on.

The 74LS173 is also always outputting its values, because the two /OE lines are tied to ground. However, it only updates the latched values when the clock on pin 7 goes high -- this is formed by inverting the 1MHz PHI2 clock from the MEGA65: That is, it will latch the value on the falling edge of a PHI2 clock.
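The behaviour described above can be captured in a few lines of Python. This is a behavioural sketch only, written from the description in this post rather than from the schematic: the class and method names are made up, and it simply assumes that a write is latched only while $DE00 is being written and that a disabled cartridge ignores further writes until reset.

```python
class BankRegister:
    """Sketch of the 4-bit control latch at $DE00 (names are illustrative)."""

    def __init__(self):
        self.q = 0          # latched bits 0-3; all low after reset
        self.prev_phi2 = 0  # last PHI2 level seen, for edge detection

    def reset(self):
        # /RESET low -> CLR high -> all latch outputs low.
        self.q = 0

    @property
    def enabled(self):
        # Bit 3 drives /EXROM directly: low means the cartridge is visible.
        return (self.q & 0x08) == 0

    @property
    def bank(self):
        # Bits 0-2 pass through inverters on their way to the EPROM.
        return (~self.q) & 0x07

    def clock(self, phi2, data, writing_de00):
        """Present one PHI2 level together with whatever is on the data bus."""
        falling = self.prev_phi2 == 1 and phi2 == 0
        if falling and writing_de00 and self.enabled:
            # The value is captured at the falling edge of PHI2, so the data
            # bus must still be valid at (and shortly after) that instant.
            self.q = data & 0x0F
        self.prev_phi2 = phi2
```

For example, after `reset()` the model reports bank 7 and `enabled` is true; clocking in $05 selects bank 2, and clocking in $08 disables the cartridge until the next reset.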

That's shown up the next bug I had: I was making the MEGA65's cartridge port write with hold on the rising edge of PHI2. So I'll fix that, make sure my tests pass, and then resynthesise.

This follows on work I had done earlier to completely refactor the cartridge port control logic, because it was quite a mess before.  I'll describe that while I wait for resynthesis to run, by annotating the VHDL:

The first part of the logic works out when the next edge of the 8MHz dotclock on the cartridge port should occur. It does this by adding a magic value to a 16 bit counter, and watching for when it overflows. This allows for quite accurate frequency generation. If there is no dotclock edge, then we do nothing and clear any strobe signals to the MEGA65's processor:

      ticker <= ('0'&ticker(15 downto 0)) + dotclock_increment;
      if ticker(16) = '0' then
        cart_access_read_strobe <= '0';
        cart_access_accept_strobe <= '0';

But if we have a dotclock edge, then we need to keep track of where we are in the 1MHz cycle: There are 8MHz/1MHz x 2 edges per cycle = 16 dotclock edges per 1MHz cycle, so we count from 0 to 15, to know where we are in the 1MHz clock cycle:

      else

        -- Each phi2_ticker increment is 1/16th of a 1MHz clock cycle,
        -- so about 64ns.
        if phi2_ticker /= 15 then
          phi2_ticker <= phi2_ticker + 1;
        else
          phi2_ticker <= 0;
        end if;

We then generate the actual dotclock signal on the cartridge port by alternating between 0 and 1 every edge:

        -- Create the 8MHz dotclock signal
        case phi2_ticker is
          when 0 | 2 | 4 | 6 | 8 | 10 | 12 | 14 => cart_dotclock <= '1';
          when others => cart_dotclock <= '0';
        end case;
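The overflow-based edge generation can be modelled in Python to see how close it gets to the target rate. This is a sketch under assumptions: the 81MHz figure is the clock quoted for the reset timer later in this post and is assumed to also clock this logic, the target of 16 million edges per second corresponds to a nominal 1MHz cycle, and the function names and the computed increment are illustrative rather than the actual `dotclock_increment` value used in the core.

```python
def increment_for(edge_rate_hz, clock_hz, bits=16):
    """Value to add each clock so the accumulator overflows edge_rate_hz times/s."""
    return round(edge_rate_hz * (1 << bits) / clock_hz)


def count_edges(increment, cycles, bits=16):
    """Run the accumulator for a number of clock cycles and count overflows."""
    mask = (1 << bits) - 1
    ticker = 0
    edges = 0
    for _ in range(cycles):
        # Same idea as the VHDL: keep the low 16 bits, add, look at the carry.
        ticker = (ticker & mask) + increment
        if ticker >> bits:
            edges += 1
    return edges


if __name__ == "__main__":
    inc = increment_for(16_000_000, 81_000_000)
    # 81,000 cycles of an 81MHz clock is one millisecond.
    print(inc, count_edges(inc, 81_000))
```

Because the remainder is carried over in the low bits, the individual edges jitter by up to one clock period, but the long-run rate only depends on how well the increment approximates the ideal ratio.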

Then we work out what to do at each stage during the 1MHz clock cycle.  The 6502 bus in the C64 has Phi1 and Phi2 halves of the clock, and can handle two separate things happening each 1MHz cycle staged in this way. Normally the CPU is using the Phi2 half, and the VIC-II the Phi1 half (except when the VIC-II steals some Phi2 cycles). Because the VIC-II can only read and not write, most peripherals will respond to a read asynchronously at any time, but will only process a write when the Phi2 clock goes low, i.e., when a falling edge is seen on the Phi2 clock -- and it is the Phi2 clock that is visible on the cartridge port.

In our state-machine, we have the low-half of the Phi2 clock first, and then have it high in the second half. Now, this is actually a bit of an approximation, as the Phi2 clock isn't actually high for 50% of the cycle, but a bit less than that if you look at the 6502 timing diagram I included above.  But there is no harm in having Phi2 stretch to a full 50% duty-cycle. Rather, it makes the timing a bit more relaxed for cartridges. So let's look at what we do at each of the 16 stages of a 1MHz clock cycle.

We use a case statement to select what to do. In the first stage (we count starting from 0), we set the Phi2 clock signal low, causing the negative edge, and just do some tidying up after any read or write request:

        case phi2_ticker is
          when 0 =>
            cart_phi2 <= '0';
            if cart_read_in_progress='1' then
              complete_read_request := true;
            end if;
            cart_write_in_progress <= '0';

In the 2nd stage (#1) we do nothing at all, as we are just allowing an extra bit of time for the T_HW / T_HR.  By waiting 2 stages, each of which is about 63ns, we are waiting ~126ns, which is near the upper limit of what the 6502 timing diagram specifies, i.e., we are being as accommodating as possible:

          when 1 =>
            -- Allow longer hold time for writes
            null;

It is only in the 3rd stage (#2) where we actually release all the cartridge port lines, i.e., stop presenting the address and data value that we might have presented during a write operation. We also release all the /ROML, /ROMH, /IO1 and /IO2 lines as part of this. If we have a new read request, we can accept it now, as well. But we can't accept a write, because Phi2 is low, and the write needs to end on a falling edge, and you can't fall off the floor (handling write requests happens later):

          when 2 =>
            -- Release key bus lines after a short hold time, and start any new
            -- access we have under way, but only if we don't already have an
            -- access happening.
            if cart_read_in_progress = '0' and cart_write_in_progress='0' then
              do_release_lines := true;
              commence_any_pending_read_request := true;
            end if;

Then for the rest of the half-cycle we do nothing at all, i.e., just continue to hold any fresh read request on the bus to give the cartridge time to process the signals:

          when 3 | 4 | 5 | 6 | 7  =>
            -- We are in the middle of the low-half of a PHI2 cycle.
            -- We are either continuing a read or write, or idle.
            -- We don't start doing anything else.
            -- We _could_ start a read now, and satisfy all timing by waiting
            -- the correct number of phi2_ticker ticks, and thus get data back
            -- to the CPU a few cycles earlier, but the benefit is relatively
            -- small, and it might not be compatible with some cartridges.
            null;

It is only after this, when we are exactly half-way through, that we do anything different. First, we need to set Phi2 high. We also tidy up and conclude any read request we had running:

          when 8 =>
            -- Begin high-half of PHI2
            cart_phi2 <= '1';
            do_release_lines := true;
            if cart_read_in_progress='1' then
              complete_read_request := true;
            end if;

Then similarly to in the first-half of the cycle, we can start a new read request. But we can also start a write request, because we are in the high-half of the Phi2 cycle, and thus there will be a falling edge to mark the write:

          when 9 | 10 =>
            if cart_read_in_progress = '0' and cart_write_in_progress='0' then
              do_release_lines := true;
              commence_any_pending_read_request := true;
              commence_any_pending_write_request := true;
            end if;

Then just as in the first half, we just hang around to give the cartridge time to process things:

          when 11 | 12 | 13 | 14 =>
            -- We are in the middle of the high-half of a PHI2 cycle.
            -- We are either continuing a read or write, or idle.
            -- We don't start doing anything else.
            -- We could in theory start a read, but not a write, as there
            -- would not be enough time before the falling edge of PHI2.
            -- But as during the low-half, we don't want to implement
            -- any really weird timing.
            null;

And finally, we just do some general house-keeping, like keeping track of what we are doing with /RESET etc:

          when 15 =>
            -- End of cycle: Check if we need to update /RESET
            -- We assert reset on cartridge port for 15 phi2 cycles to give
            -- cartridge time to reset.
            if (reset_counter = 1) and (reset='1') then
              reset_counter <= 0;
            elsif reset_counter /= 0 then
              reset_counter <= reset_counter - 1;
            elsif reset_counter = 0 then
              if (not_joystick_cartridge = '1' and force_joystick_cartridge='0') or (disable_joystick_cartridge='1') then
                cart_reset <= reset and (not cart_force_reset);
                cart_reset_int <= reset and (not cart_force_reset);
                if cart_reset_int = '0' then
                  report "Releasing RESET on cartridge port";
                end if;
              end if;
            end if;

And in many ways that's really all there is to it.  Each of those variables that I set to true triggers a block that does the required action, each of which are quite simple, and don't really need to be listed here at the moment.  
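As a summary of the walk-through, the sixteen stages can be written down as a few small Python helpers, which also makes the hold-time arithmetic explicit. The stage numbers are taken from the VHDL above; the helper names and the use of an exact 1MHz cycle (62.5ns per stage) are my simplifications for illustration.

```python
STAGES_PER_CYCLE = 16
TICK_NS = 1000 / STAGES_PER_CYCLE   # 62.5ns per stage for a nominal 1MHz cycle
RELEASE_STAGE = 2                   # stage at which bus lines are released


def phi2_level(stage):
    """PHI2 is low for stages 0-7 and high for stages 8-15."""
    return 0 if stage < 8 else 1


def may_start_read(stage):
    """Reads can be started in either half of the cycle."""
    return stage in (2, 9, 10)


def may_start_write(stage):
    """Writes only start while PHI2 is high, so a falling edge will follow."""
    return stage in (9, 10)


def hold_time_ns():
    """Time between the PHI2 falling edge (stage 0) and releasing the lines."""
    return RELEASE_STAGE * TICK_NS
```

With these numbers the hold time works out to 125ns, inside the 60 to 150ns window quoted from the datasheet.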

Anyway, the resynthesis has finished, and it's still not working.  The problem I am seeing now is that the data lines are holding the value from a previous write, rather than the results of the read.  

I think what is happening here is that when I was setting the data lines to input, I was actually disabling input and output.  This was happening because the data lines go through a bidirectional buffer that has DIRection and /ENable lines. /EN has to be low to allow data to flow in any direction, and DIR has to be 1 for output and 0 for input. When I wanted to stop outputting I was setting DIR=0 and EN=1. That stops output alright, but it also stops input. As a result the FPGA pins were effectively isolated from the data lines, and any value previously output on those pins would persist in being visible for some time, until the charge on the FPGA pins was consumed. I've fixed that and am resynthesising it now.  
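The buffer behaviour described here boils down to a three-row truth table, sketched below in Python. The function name and the returned strings are purely illustrative; the pin semantics (/EN low to pass data, DIR=1 for output, DIR=0 for input) are as described in the paragraph above.

```python
def buffer_mode(dir_pin, en_n):
    """State of the bidirectional data buffer for given DIR and /EN levels."""
    if en_n == 1:
        # Disabled: nothing flows either way, so the FPGA pins float and
        # keep showing whatever was last driven on them.
        return "isolated"
    return "output" if dir_pin == 1 else "input"
```

The buggy "stop outputting" setting corresponds to `buffer_mode(0, 1)`, which is isolated, whereas reading from the cartridge needs `buffer_mode(0, 0)`.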

While that happens, I am curious to see if I can observe the charge dissipating. If so, it will give me more confidence that was really the problem: Nope. Hmm... that makes me think it must be something else. Indeed, after resynthesis the problem persists. I have added extra checks to my simulation tests to ensure that the data direction is correct etc, but it hasn't picked up anything.  I guess the next step will be attacking the cartridge breakout board with the oscilloscope again, to see if I can work out what is happening.

Let's start by seeing if pins 3 - 5 of the 74LS173 get updated when I write to $DE00. They should latch the values from bits 0 - 2. No action is visible, so let's check pins 9 and 10 are low. Pin 9 is the one connected to /EXROM that can go high if the cartridge is commanded to disable itself. And it has indeed gone high -- so something has written to it that it has interpreted as a command to disable itself, i.e., had bit 3 set.

It could be that the cartridge port never goes through the reset process on cold start, since triggering a reset does fix that problem.  Probing it on power-up confirms that this is indeed the case.  I'm not 100% sure why, but I'm suspecting it is because the 5V rail to the cartridge port takes longer to come up than the reset sequence trigger takes to happen, so the reset sequence happens, but without any measurable impact on the cartridge port. 

It looks like it can take perhaps as long as 20 ms for the DC-DC converter to get to voltage. We don't have a feed-back line for sensing when the voltage has stabilised, so we will just need to have an initial figure that is long enough. This would certainly explain the lack of reset to the cartridge on cold start, but working on warm resets. The reset timer is clocked at 81MHz, so 20ms = 0.02 x 81x10^6 = 1.62x10^6. I'll round that up to 2 million, just to be sure.
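The counter sizing above is simple enough to check in a couple of lines of Python. This is just the same arithmetic as in the text, using integer milliseconds to avoid floating-point rounding; the function name is made up for the example.

```python
def reset_cycles(settle_ms, clock_hz):
    """Clock cycles needed to cover a settling time given in milliseconds."""
    return settle_ms * clock_hz // 1000


if __name__ == "__main__":
    needed = reset_cycles(20, 81_000_000)   # 20ms at 81MHz
    chosen = 2_000_000                      # rounded up, as in the text
    # Print the minimum count, the margin it leaves, and the counter width.
    print(needed, chosen - needed, chosen.bit_length())
```

Rounding up to 2 million cycles gives roughly 24.7ms at 81MHz, and needs a 21-bit counter.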

I'm hoping that will get the cold reset stuff sorted. But nope, now it's totally broken.  I think I have it fixed now, and resynthesising.

Meanwhile, although reset is broken, after I force /RESET to pulse low, I can write to the $DE00 bank select register, and I can read stuff from the cartridge -- and the banking is even working... but. The "but" is that it's not always reading the correct values.  While annoying, this is still a big step forward. So I'll put some attention on debugging the reading.

It looks like each location ends up reading either what I assume is the correct value, or some incorrect value -- that seems most of the time to be $20:


The above displays 10 attempts of reading $8000-$800F. As we can see, we end up reading $20 a lot of the time. I think it should be reading:

09 80 09 80 C3 C2 CD 38 30  8E 16 D0 20 A3 FD 20

Some byte positions are more stable than others.

I'm not yet sure what is happening here, but I'm suspecting that sometimes we are reading something other than the value from the cartridge data lines.  There doesn't seem to be any other variation apart from this spurious returning of $20 instead of the correct value, which makes me think that the timing is probably ok, as otherwise we would see multiple different values appearing.

So what on earth can be providing the $20 value?  I am a bit suspicious that to force a reset on the cartridge port that I had to write $20 to $7010000. But writing a different value to $7010000 doesn't change the value that gets read back.

Hmmm... After resynthesising with the reset control fixes, the $20 problem has disappeared, but now the banking via $DE00 seems to be broken again. All simulation tests are still passing.  Hmm... If I force a reset on the cartridge port, the cartridge is re-enabled, and responds to banking commands. So in a sense it's already working. But Tiny Quest still fails to start. But I think that's just a fundamental incompatibility of Tiny Quest with the MEGA65 core, so I'm happy to leave that at that.

Let's just make sure that resynthesising the same VHDL again gives the same and predictable result, to make sure it isn't a random synthesis thing: That looks ok.

Let's just double check by trying Sam's Journey, that we expect to work fine.

Yup, that works, so I'm fairly confident that I have it all working well now.  It's a bit of a bummer that Tiny Quest doesn't work in the MEGA65 core, but it does work in the C64 core, so that's actually not a big problem.

In short, I think it's all working now. Next step is for the team to test it, and let me know if there are any problems... which is happening now.

In the process, I tried plugging in an EasyFlash 3, and while it doesn't always seem to work from cold start, after pressing the RESET button it now works again, which is a big improvement, as it stopped working on MEGA65 cores a long time ago and we hadn't had the opportunity to investigate the cause.  But now it works nicely, at least in terms of loading and showing the menu:


Although, that being said, it is still being a bit temperamental, so some further investigation will be required. But it is still progress.